Abstract 



This paper introduces Markov chains and processes over nonabelian free groups 
and semigroups. We prove a formula for the $f$-invariant of a Markov chain over a 
free group in terms of transition matrices that parallels the classical formula for the 
entropy of a Markov chain. Applications include free group analogues of the 
Abramov-Rohlin formula for skew-product actions and Yuzvinskii's addition formula 
for algebraic actions. 

1 Introduction 

A (classical) Markov chain is an $\mathbb{N}$- or $\mathbb{Z}$-indexed family of random variables $(X_i)$, each taking 
values in a set $K$ (called the state space) and satisfying the following condition: for any 
$i \in \mathbb{N}$ or $\mathbb{Z}$ and $k_{i+1}, k_i, \ldots \in K$, 

\[
\Pr(X_{i+1} = k_{i+1} \mid X_i = k_i) = \Pr(X_{i+1} = k_{i+1} \mid X_i = k_i, X_{i-1} = k_{i-1}, \ldots) = M_{k_i, k_{i+1}}
\]

where $M$ is a fixed $K \times K$ matrix called the transition matrix. We will always assume that 
$K$ is at most countable and has the discrete topology. 

These objects can be viewed from an ergodic theory perspective as follows. Let $K^G$ 
denote the set of all functions $G \to K$ where $G$ equals $\mathbb{Z}$ or $\mathbb{N}$. Let $\mu$ be the probability 
measure on $K^G$ defined by setting $\mu(E)$ equal to the probability that the Markov chain $(X_i)$ 
(considered as a function from $G$ to $K$) is in $E$. Let $\alpha$ be the "time 0 partition" defined 
by $\alpha = \{A_k : k \in K\}$ where $A_k = \{x : G \to K : x(0) = k\}$. Let $\sigma : K^G \to K^G$ be 
the shift-operator, defined by $\sigma(x)(n) = x(n+1)$. We call the quadruple $(\sigma, K^G, \mu, \alpha)$ a 
Markov process. 

When $\mu$ is shift-invariant, it satisfies several nice properties. First, the entropy rate 
$h(\sigma, \mu)$ equals $H(\alpha \mid \sigma^{-1}\alpha)$ where $H(\cdot)$ is Shannon's entropy (see §1.1 for the definition). In 
fact, this property characterizes Markov processes. Second, the space of all $n$-step Markov 
processes (which are generalizations of the above) is dense in the space of shift-invariant 
Borel probability measures on $K^G$ with the weak* topology. 
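
As a small numerical illustration of the first property, the following sketch computes the entropy rate of a stationary two-state chain in two ways: from the transition-matrix formula $-\sum_{i,j} \pi_i M_{ij} \log M_{ij}$ and as a conditional entropy of two consecutive coordinates. The matrix `M`, the helper names and the use of power iteration to approximate the stationary vector are illustrative assumptions, not taken from the paper.

```python
import math

def stationary_vector(M, iterations=10_000):
    """Approximate a stationary row vector pi (pi M = pi) by power iteration.
    This assumes the chain is irreducible and aperiodic."""
    n = len(M)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * M[i][j] for i in range(n)) for j in range(n)]
    return pi

def shannon(p):
    """Shannon entropy of a probability vector, skipping zero entries."""
    return -sum(q * math.log(q) for q in p if q > 0)

def entropy_rate(M, pi):
    """Transition-matrix formula: -sum_i pi_i sum_j M_ij log M_ij."""
    n = len(M)
    return -sum(pi[i] * M[i][j] * math.log(M[i][j])
                for i in range(n) for j in range(n) if M[i][j] > 0)

# Hypothetical two-state transition matrix (rows sum to 1).
M = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_vector(M)
h = entropy_rate(M, pi)

# Conditional entropy of one coordinate given its neighbour:
# H(X_0, X_1) - H(X_0), using the joint law pi_i * M_ij.
joint = [pi[i] * M[i][j] for i in range(2) for j in range(2)]
h_conditional = shannon(joint) - shannon(pi)
```

For a stationary chain the two coordinates $X_0$ and $X_1$ have the same marginal, so conditioning on either neighbour gives the same number; this is why the value agrees with $H(\alpha \mid \sigma^{-1}\alpha)$.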

* email: lpbowen@math.hawaii.edu 

The purpose of this paper is to build an analogous theory when $G$ is a free group or 
semigroup and entropy is replaced with the $f$-invariant. The latter is a measure-conjugacy 
invariant that generalizes Kolmogorov-Sinai entropy. It was introduced in [Bo08a]. For the 
reader's convenience, the $f$-invariant is defined in §1.2 and (most of) the proof that it is a 
measure-conjugacy invariant is recalled in §5. 

The definition of a Markov chain over a free group is similar to the definition of a tree-indexed 
Markov chain (see [Pe95] and the references therein). The primary novelty here is that we 
study ergodic-theoretic aspects and especially relationships with entropy theory. 

The notion of a Markov chain is connected with the notion of a "past". For example, 
the past of an element $n \in \mathbb{Z}$ is the set $\mathbb{Z} \cap (-\infty, n)$. A stationary Markov chain $(X_i)_{i \in \mathbb{Z}}$ is 
characterized by the property that the distribution of $X_i$ conditioned on $X_{i-1}$ (its immediate 
past) equals the distribution of $X_i$ conditioned on $X_j$ for all $j$ in the past of $i$ and that this 
conditional distribution is independent of $i$. 

If $G = \langle s_1, \ldots, s_r \rangle$ is a free group then every element $g \in G$ has $2r$ different "pasts", 
corresponding to the given generators. For example, the past of an element $g$ in the direction 
of $s \in \{s_1^{\pm 1}, \ldots, s_r^{\pm 1}\}$ is the set of all elements of the form $fsg$ where $f \in G$, $|fsg| = |f| + |sg|$ 
and $|\cdot|$ denotes the word metric on $G$. A Markov chain over $G$ is a $G$-indexed family of 
random variables $(X_g)_{g \in G}$ such that the distribution of $X_g$ conditioned on $X_{sg}$ equals the 
distribution of $X_g$ conditioned on $X_h$ for all $h$ in the past of $g$ in the $s$-direction and that 
this conditional distribution is independent of $g$. 

To view this from the ergodic theory perspective, for $g \in G$, let $T_g : K^G \to K^G$ be the 
shift-operator defined by $T_g(x)(f) = x(fg)$. Let $\mu$ be the measure on $K^G$ defined by setting $\mu(E)$ 
equal to the probability that the Markov chain $(X_g)_{g \in G}$, considered as a function from $G$ to $K$, is 
in $E$. Let $\alpha$ be the "time $e$ partition": $\alpha = \{A_k : k \in K\}$ where $A_k = \{x \in K^G : x(e) = k\}$. 
The action $G \curvearrowright^T (K^G, \mu)$ with the partition $\alpha$ is a Markov process. A more general and 
precise definition is in §6. 

In the classical case, stationary Markov chains can be easily constructed in terms of 
transition matrices and stationary vectors. We show that there is an analogous construction 
in the case of free groups. This should be useful for the classification problem 
for dynamical systems over free groups up to isomorphism. For example, it is shown in 
§8.3 that there is a mixing Markov chain that is not isomorphic to any Bernoulli shift. This 
contrasts with the Friedman-Ornstein theorem [FO70] that every mixing Markov chain over 
the integers is isomorphic to a Bernoulli shift. We also exhibit examples of Markov chains 
related to well-studied problems in the theory of random regular graphs. 

Assume now that $(X_g)_{g \in G}$ is a stationary Markov chain. This implies $\mu$ is shift-invariant. 
We will show that the $f$-invariant of the system $G \curvearrowright^T (K^G, \mu)$ equals 

\[
F(\mu, \alpha) := (1 - 2r) H(\alpha) + \sum_{i=1}^{r} H(\alpha \vee T_{s_i}^{-1}\alpha).
\]

Indeed, this condition characterizes Markov processes. Since 
$f(\mu, \alpha) \le F(\mu, \alpha)$ holds in general, it follows that for every shift-invariant Borel probability 
measure $\nu$ on $K^G$ that equals $\mu$ on the partitions $\alpha \vee T_{s_i}^{-1}\alpha$, $f(\nu, \alpha) \le f(\mu, \alpha)$ with equality 
if and only if $\mu = \nu$. In brief: the $f$-invariant is uniquely maximized on Markov chains. 
Moreover, there is a precise sense in which every process over $G$ can be approximated by a 
sequence of "higher-step" Markov processes. These tools are used to prove analogues of two 
classical results: the Abramov-Rohlin formula and Yuzvinskii's addition formula. To explain 
these results precisely, let us review the definitions of entropy and the $f$-invariant next. 

1.1 Classical results 

Let $(X, \mathcal{B}, \mu)$ be a probability space. Let $T : X \to X$ be a measure-preserving transformation. 
We use $\alpha, \beta$ to denote measurable partitions of $X$ into at most countably many subsets. The 
join of $\alpha$ and $\beta$ is their common refinement, defined by $\alpha \vee \beta := \{A \cap B : A \in \alpha, B \in \beta\}$. 
The entropy of $\alpha$ is 

\[
H(\alpha) := -\sum_{A \in \alpha} \mu(A) \log(\mu(A)).
\]

We will need a relative version of this quantity as well. So let $\mathcal{F} \subset \mathcal{B}$ be a sub-$\sigma$-algebra. 
Given $A \in \mathcal{B}$, let $\mu(A|\mathcal{F})$ be the conditional expectation of the characteristic function $\chi_A$ of 
$A$ with respect to $\mathcal{F}$. The conditional information function $I(\alpha|\mathcal{F}) : X \to \mathbb{R}$ is defined 
by 

\[
I(\alpha|\mathcal{F})(x) := -\log\big(\mu(A_x|\mathcal{F})(x)\big)
\]

where $A_x$ is the atom of $\alpha$ containing $x$. The entropy of $\alpha$ conditioned on $\mathcal{F}$ is 

\[
H(\alpha|\mathcal{F}) := \int I(\alpha|\mathcal{F})(x) \, d\mu(x).
\]

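The mean entropy of $\alpha$ given $\mathcal{F}$ with respect to $T$ is 

\[
h(T, \alpha|\mathcal{F}) := \lim_{n \to \infty} \frac{1}{n} H\Big(\bigvee_{i=1}^{n} T^{-i}\alpha \,\Big|\, \mathcal{F}\Big).
\]

If $\mathcal{F}$ is $T$-invariant (i.e., if $T^{-1}A \in \mathcal{F}$ for all $A \in \mathcal{F}$) then this limit exists. It is well-known that 

\[
h(T, \alpha|\mathcal{F}) = H\Big(\alpha \,\Big|\, \mathcal{F} \vee \bigvee_{i=1}^{\infty} T^{-i}\alpha\Big).
\]

Define $h(T|\mathcal{F}) := \sup_\alpha h(T, \alpha|\mathcal{F})$ where the supremum is over all partitions $\alpha$ with $H(\alpha|\mathcal{F}) < \infty$. 
Let $\tau = \{X, \emptyset\}$ be the minimal $\sigma$-algebra and let $h(T, \alpha) = h(T, \alpha|\tau)$ and $h(T) := h(T|\tau)$. 
When it is helpful to emphasize the measure we will write $h(T, \mu, \alpha)$ for $h(T, \alpha)$. 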
We will generalize the next two theorems to actions of free groups. 

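Theorem 1.1 (The Abramov-Rohlin Formula). If $\alpha$ and $\beta$ are any two measurable partitions 
with $H(\alpha) + H(\beta) < \infty$ then 

\[
h(T, \alpha \vee \beta) = h(T, \alpha) + h(T, \beta|\alpha^T)
\]

where $\alpha^T$ denotes the smallest $T$-invariant $\sigma$-algebra containing $\alpha$. 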

The original Abramov-Rohlin formula, proven in [AR62], was stated in terms of 
skew-products. The version above is due to Bogenschütz and Crauel [BC92]. This formula was 
generalized in [WZ92] to amenable group actions. See [Da01] for an alternative proof using 
orbit equivalence theory. 

Theorem 1.2 (Yuzvinskii's Addition Formula). Let $\mathcal{G}$ be a separable compact group, 
$T : \mathcal{G} \to \mathcal{G}$ a Haar measure-preserving homomorphism and $\mathcal{N} \lhd \mathcal{G}$ a closed normal $T$-invariant 
subgroup. Let $T_{\mathcal{G}/\mathcal{N}} : \mathcal{G}/\mathcal{N} \to \mathcal{G}/\mathcal{N}$ be the induced homomorphism and $T_{\mathcal{N}} : \mathcal{N} \to \mathcal{N}$ the 
restriction of $T$ to $\mathcal{N}$. Then 

\[
h(T) = h(T_{\mathcal{G}/\mathcal{N}}) + h(T_{\mathcal{N}})
\]

where each entropy rate is with respect to the Haar probability measure on $\mathcal{G}$, $\mathcal{G}/\mathcal{N}$ and $\mathcal{N}$ 
respectively. 

This was proven first in [Yu65]. R. K. Thomas [Th71] extended this formula to 
skew-product actions. In [LSW90] it was generalized to actions of $\mathbb{Z}^d$. There are related results 
in [LSW90, De06, DS07, BM08]. In a very recent preprint [LS09], Lind and Schmidt have 
extended Yuzvinskii's formula to all algebraic actions of an arbitrary amenable group. 

1.2 Free groups and semigroups 

From now on, let $G = \langle s_1, \ldots, s_r \rangle$ denote either a free group or a free semigroup with identity. 
If $G$ is a group, let $S = \{s_1^{\pm 1}, \ldots, s_r^{\pm 1}\}$. In the semigroup case, let $S = \{s_1, \ldots, s_r\}$. Let 
$|\cdot| : G \to \mathbb{R}$ denote the word metric with respect to $S$. 

We will write $G \curvearrowright^T (X, \mathcal{B}, \mu)$ to denote that $T : G \to \mathrm{End}(X, \mathcal{B}, \mu)$ is a homomorphism 
from $G$ into the semigroup of measure-preserving transformations of $(X, \mathcal{B}, \mu)$, which we 
will always assume is a standard probability space. Measure-preserving means that for all 
$g \in G$ and $E \in \mathcal{B}$, $\mu(T_g^{-1}E) = \mu(E)$. If $\alpha$ is a partition of $X$ and $Q \subset G$ is finite, then 
$\alpha^Q := \bigvee_{q \in Q} T_q^{-1}\alpha$. To simplify notation, let $\alpha^n := \alpha^{B(e,n)}$ where $B(e,n) \subset G$ is the ball of 
radius $n$ centered at the identity element with respect to the word metric. Define 

\[
F(T, \alpha) := (1 - 2r) H(\alpha) + \sum_{i=1}^{r} H(\alpha \vee T_{s_i}^{-1}\alpha),
\]

\[
f(T, \alpha) := \inf_{n > 0} F(T, \alpha^n).
\]

In [Bo08a], it is proven that if $\alpha$ generates (i.e., the smallest $G$-invariant $\sigma$-algebra containing 
$\alpha$, denoted $\alpha^G$, equals $\mathcal{B}$ up to sets of measure zero) and if $\beta$ also generates and $H(\alpha) + H(\beta) < \infty$, 
then $f(T, \alpha) = f(T, \beta)$. This common number is called the $f$-invariant of the action 
(denoted $f(T)$). It is a measure-conjugacy invariant. It is our substitute for entropy rate. 
Unlike the classical case, $f(T)$ is well-defined only if there exists a generating partition $\alpha$ 
with $H(\alpha) < \infty$. Also $f(T)$ can take negative values. 

We will need the following relative versions. If $\mathcal{F} \subset \mathcal{B}$ is a sub-$\sigma$-algebra then define 
$\mathcal{F}^Q$ and $\mathcal{F}^n$ similarly to the above and let 

\[
F(T, \alpha|\mathcal{F}) := (1 - 2r) H(\alpha|\mathcal{F}) + \sum_{i=1}^{r} H(\alpha \vee T_{s_i}^{-1}\alpha \mid \mathcal{F} \vee T_{s_i}^{-1}\mathcal{F}),
\]

\[
f(T, \alpha|\mathcal{F}) := \inf_{n} F(T, \alpha^n|\mathcal{F}^n).
\]

When $T$ is fixed we will write $f(\alpha|\mathcal{F})$ instead of $f(T, \alpha|\mathcal{F})$. When it is helpful to emphasize 
the measure we will write $f(T, \mu, \alpha|\mathcal{F})$ instead of $f(T, \alpha|\mathcal{F})$. 
The next theorem generalizes the Abramov-Rohlin formula. 

Theorem 1.3. Let $G \curvearrowright^T (X, \mathcal{B}, \mu)$. If $\alpha$ and $\beta$ are partitions of $X$ with $H(\alpha) + H(\beta) < \infty$ 
then 

\[
f(T, \alpha \vee \beta) = f(T, \alpha) + f(T, \beta|\alpha^G).
\]

To illustrate, a simple calculation shows that if $X$ has exactly $n$ elements and $\mu$ is 
the uniform probability measure on $X$ then $f(T) = (1 - r) \log(n)$. Note this is negative if 
$n > 1$ and $r > 1$. The above theorem and standard skew-product arguments now imply: 

Corollary 1.4. Let $G \curvearrowright^T (X, \mathcal{B}, \mu)$ be an ergodic $G$-system. Let $G \curvearrowright^S (Y, \mathcal{C}, \nu)$ and suppose 
there is an $n$-to-1 factor map $\phi : X \to Y$ (i.e., $\phi_*\mu = \nu$, $\phi(T_g x) = S_g \phi(x)$ for a.e. $x$ and 
$|\phi^{-1}(y)| = n$ for a.e. $y$). Then 

\[
f(S) = (r - 1) \log(n) + f(T)
\]

whenever $f(S)$ and $f(T)$ are well-defined. 

The next result generalizes Yuzvinskii's addition formula. 

Theorem 1.5. Let $G = \langle s_1, \ldots, s_r \rangle$ be a rank $r$ free group or semigroup. Let $\mathcal{G}$ be a separable 
compact group which is either totally disconnected, a Lie group, or a finite-dimensional 
connected abelian group. Let $T_{\mathcal{G}} : G \to \mathrm{End}(\mathcal{G})$ be a homomorphism and let $\mathcal{N} \lhd \mathcal{G}$ be a 
closed normal $G$-invariant subgroup. Let $T_{\mathcal{N}} : G \to \mathrm{End}(\mathcal{N})$ and $T_{\mathcal{G}/\mathcal{N}} : G \to \mathrm{End}(\mathcal{G}/\mathcal{N})$ be 
the induced homomorphisms. Then 

\[
f(T_{\mathcal{G}}) = f(T_{\mathcal{G}/\mathcal{N}}) + f(T_{\mathcal{N}})
\]

whenever $f(T_{\mathcal{G}})$, $f(T_{\mathcal{G}/\mathcal{N}})$ and $f(T_{\mathcal{N}})$ are well-defined. The numbers $f(T_{\mathcal{G}})$, $f(T_{\mathcal{G}/\mathcal{N}})$ and 
$f(T_{\mathcal{N}})$ are computed with respect to Haar probability measure on $\mathcal{G}$, $\mathcal{G}/\mathcal{N}$ and $\mathcal{N}$ respectively. 

I conjecture that the above result holds for all separable compact groups $\mathcal{G}$. In [El99] 
it was proven that there is no invariant for nonabelian free group actions (and many other 
nonamenable groups) that satisfies a Yuzvinskii-type formula under some rather general 
assumptions on the invariant. But the $f$-invariant does not satisfy these because it can take 
negative values. 

To illustrate, let us recall the following example from [OW87]. Let $G = \langle s_1, s_2 \rangle$ be the 
rank 2 free group. Let $\mathcal{G} = (\mathbb{Z}/2\mathbb{Z})^G$ be the set of all functions from $G$ to $\mathbb{Z}/2\mathbb{Z}$. It is a 
group under pointwise addition. It can be considered as the product of $G$ copies of $\mathbb{Z}/2\mathbb{Z}$. 
By Tychonoff's theorem, it is compact. Let $\mathcal{N} \lhd \mathcal{G}$ be the subgroup of constant functions. So 
$|\mathcal{N}| = 2$. For $g \in G$, define $T_g : \mathcal{G} \to \mathcal{G}$ by $T_g x(f) = x(g^{-1}f)$. This action preserves Haar 
measure on $\mathcal{G}$ and leaves $\mathcal{N}$ invariant. 

In [OW87], it is pointed out that $\mathcal{G}/\mathcal{N}$ is isomorphic to $\mathcal{G} \times \mathcal{G} \cong (\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z})^G$. Indeed 
the factor map $\phi : (\mathbb{Z}/2\mathbb{Z})^G \to (\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z})^G$ defined by 

\[
\phi(x)(g) = \big(x(g) + x(s_1 g),\; x(g) + x(s_2 g)\big)
\]

defines an isomorphism $\mathcal{G}/\mathcal{N} \cong (\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z})^G = \mathcal{G} \times \mathcal{G}$. So, the above theorem implies 
that 

\[
f(T_{\mathcal{G}}) = f(T_{\mathcal{N}}) + f(T_{\mathcal{G} \times \mathcal{G}}).
\]

This is easily verified. $T_{\mathcal{G}}$ and $T_{\mathcal{G} \times \mathcal{G}}$ are both Bernoulli shift actions. From one of the main 
results of [Bo08a], it follows that $f(T_{\mathcal{G}}) = \log(2)$ and $f(T_{\mathcal{G} \times \mathcal{G}}) = \log(4)$. The action of $G$ on 
$\mathcal{N}$ is trivial and it is easy to verify that $f(T_{\mathcal{N}}) = -\log(2)$ as required. Alternatively, since 
the above factor map is 2-to-1, this formula can be derived from corollary 1.4. 
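
The arithmetic in this example can be checked with a few lines of code. The sketch below only evaluates the function $F$ from its definition for two cases where the joins are easy to describe: a finite uniform space with the partition into points (each join equals the partition itself), and a partition whose translates are independent, as for the canonical partition of a Bernoulli shift (each join has twice the entropy). It does not compute the infimum defining $f$; the function names are illustrative assumptions.

```python
import math

def shannon(p):
    """Shannon entropy of a probability vector, skipping zero entries."""
    return -sum(q * math.log(q) for q in p if q > 0)

def F(r, H_alpha, H_joins):
    """(1 - 2r) H(alpha) + sum_i H(alpha v T_{s_i}^{-1} alpha)."""
    assert len(H_joins) == r
    return (1 - 2 * r) * H_alpha + sum(H_joins)

def F_finite_uniform(r, n):
    """Point partition of an n-element uniform space: every join is the partition itself."""
    H = shannon([1.0 / n] * n)
    return F(r, H, [H] * r)

def F_independent(r, base):
    """Partition with independent translates: every join has entropy 2 H(alpha)."""
    H = shannon(base)
    return F(r, H, [2 * H] * r)

r = 2
f_N = F_finite_uniform(r, 2)            # (1 - r) log 2 = -log 2
f_G = F_independent(r, [0.5, 0.5])      # log 2
f_GxG = F_independent(r, [0.25] * 4)    # log 4
```

With these values, `f_G` equals `f_N + f_GxG`, matching the addition formula in the example, and `f_N` agrees with the general expression $(1 - r)\log(n)$ for $n = 2$.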

1.3 An alternative formulation of the $f$-invariant 

We will prove the following formula for the $f$-invariant, which helps transfer 
results from the classical case to the case of free groups. To explain, let $G \curvearrowright^T (X, \mathcal{B}, \mu)$, let 
$\mathcal{F} \subset \mathcal{B}$ be a $T$-invariant sub-$\sigma$-algebra and let $\alpha$ be a partition of $X$. Define 

\[
F_*(T, \alpha|\mathcal{F}) := (1 - r) H(\alpha|\mathcal{F}) + \sum_{i=1}^{r} h(T_{s_i}, \alpha|\mathcal{F}),
\]

\[
f_*(T, \alpha|\mathcal{F}) := \inf_{n > 0} F_*(T, \alpha^n|\mathcal{F}).
\]

In §9 we prove that $f_*(T, \alpha|\mathcal{F}) = f(T, \alpha|\mathcal{F})$. 

1.4 Organization 

§2 explains notation used throughout the paper. §3 is a review of classical entropy theory. 
§4 is a study of the space of partitions of $X$. §5 introduces $f_*$ and proves that $f$ and $f_*$ 
are measure-conjugacy invariants (using the main theorem of §4). §6 introduces Markov 
processes and proves that $F(\alpha) = f(\alpha)$ for such processes. §7 develops a constructive 
approach to Markov processes via transition matrices and symbolic dynamics. §8 presents 
three examples of Markov processes. §9 proves that $f = f_*$ using Markov approximations to 
an arbitrary system. This is then used to give a short proof of theorem 1.3 in §10. §11 proves 
that if a process $(T, X, \mu, \alpha)$ satisfies $F(\alpha) = f(\alpha)$ then it must be Markov. §12 proves more 
approximation results that are used in §13 to prove theorem 1.5. 

Acknowledgements. I would like to thank Russ Lyons for suggesting that I think 
about the isomorphism problem for Bernoulli shifts over a nonabelian free group and for 
many useful conversations along the way. I'd also like to thank Benjy Weiss for asking 
whether the infinite entropy Bernoulli shift over a nonabelian free group could be finitely 
generated. That question is answered in [Bo08b] and a different proof is provided in §12. 

2 Notation 

In general, $G := \langle s_1, \ldots, s_r \rangle$ denotes either a free group or free semigroup with 1. If $G$ is a 
group, let $S = \{s_1^{\pm 1}, \ldots, s_r^{\pm 1}\}$. In the semigroup case, let $S = \{s_1, \ldots, s_r\}$. 

We will write $G \curvearrowright^T (X, \mathcal{B}, \mu)$ to denote that $T : G \to \mathrm{End}(X, \mathcal{B}, \mu)$ is a homomorphism 
from $G$ into the semigroup of measure-preserving transformations of $(X, \mathcal{B}, \mu)$, which we will 
always assume is a standard probability space. Measure-preserving means that for all $g \in G$ 
and $E \in \mathcal{B}$, $\mu(T_g^{-1}E) = \mu(E)$. When convenient we will ignore the $\sigma$-algebra by writing 
$G \curvearrowright^T (X, \mu)$ instead. The triple $(T, X, \mu)$ is called a $G$-system or an action of $G$. 

We use $\alpha, \beta$ to denote partitions of $X$ into at most countably many measurable subsets. 



3 Review of classical entropy theory 

Fix a probability space $(X, \mathcal{B}, \mu)$. 

Definition 1. A partition $\alpha = \{A_1, A_2, \ldots\}$ is a pairwise disjoint collection of measurable 
subsets $A_i$ of $X$ such that $\bigcup_i A_i = X$. The sets $A_i$ are called the partition elements of $\alpha$. 
Alternatively, they are called the atoms of $\alpha$. Unless stated otherwise, all partitions in this 
paper are either finite or countably infinite. 

Definition 2. If $\alpha$ and $\beta$ are partitions of $X$ then the join of $\alpha$ and $\beta$ is the common 
refinement partition $\alpha \vee \beta = \{A \cap B \mid A \in \alpha, B \in \beta\}$. By abuse of notation, we will 
sometimes identify a join with the $\sigma$-algebra that it generates. Thus if $\alpha_1, \alpha_2, \ldots$ is a sequence 
of partitions then $\bigvee_{i=1}^{\infty} \alpha_i$ is identified with the smallest $\sigma$-algebra of $X$ that contains every 
atom of $\alpha_i$ for all $i$. 

Definition 3. The information function $I(\alpha) : X \to \mathbb{R}$ corresponding to a partition $\alpha$ is 
defined by 

\[
I(\alpha)(x) = -\log(\mu(A_x))
\]

where $A_x$ is the atom of $\alpha$ containing $x$. 

Definition 4. The entropy $H(\alpha)$ of $\alpha$ is defined by 

\[
H(\alpha) = -\sum_{A \in \alpha} \mu(A) \log(\mu(A)) = \int_X I(\alpha)(x) \, d\mu(x).
\]

By convention $\log(0) = 0$. 

Definition 5. Let $G$ be a group (or semigroup with 1) acting on $(X, \mathcal{B}, \mu)$. Let $\alpha$ be a 
partition. Let $\alpha^G$ be the smallest $G$-invariant $\sigma$-algebra containing the atoms of $\alpha$. Then 
$\alpha$ is generating (with respect to the given action of $G$) if for every measurable set $A \subset X$ 
there exists a set $A' \in \alpha^G$ such that $\mu(A \triangle A') = 0$. 

Definition 6. Let $T : X \to X$ be a measure-preserving transformation. The mean entropy 
of a partition $\alpha$ of $X$ is 

\[
h(T, \alpha) := \lim_{n \to \infty} \frac{H\big(\bigvee_{j=0}^{n} T^{-j}\alpha\big)}{n+1} = \lim_{n \to \infty} H\Big(T^{-n-1}\alpha \,\Big|\, \bigvee_{j=0}^{n} T^{-j}\alpha\Big).
\]

A. N. Kolmogorov proved [Ko58, Ko59] that if $\alpha$ and $\beta$ are finite-entropy generating 
partitions then $h(T, \alpha) = h(T, \beta)$. Y. Sinai proved [Si59] that if $\alpha$ is any finite-entropy partition 
and $\beta$ is a generating partition then $h(T, \alpha) \le h(T, \beta)$. So the entropy of the system is 
defined by $h(T) := \sup_\alpha h(T, \alpha)$ where the sup is over all finite-entropy partitions $\alpha$. This 
defines an isomorphism invariant of the system $(T, X, \mu)$. 

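Definition 7. Let $\mathcal{F}$ be a $\sigma$-algebra contained in the $\sigma$-algebra of all measurable subsets of 
$X$. Given a partition $\alpha$, define the conditional information function $I(\alpha|\mathcal{F}) : X \to \mathbb{R}$ 
by 

\[
I(\alpha|\mathcal{F})(x) = -\log\big(\mu(A_x|\mathcal{F})(x)\big)
\]

where $A_x$ is the atom of $\alpha$ containing $x$. Here, if $A \subset X$ is measurable then $\mu(A|\mathcal{F}) : X \to \mathbb{R}$ 
is the conditional expectation of $\chi_A$, the characteristic function of $A$, with respect to the 
$\sigma$-algebra $\mathcal{F}$. The conditional entropy of $\alpha$ with respect to $\mathcal{F}$ is defined by 

\[
H(\alpha|\mathcal{F}) = \int_X I(\alpha|\mathcal{F})(x) \, d\mu(x).
\]

If $\beta$ is a partition then, by abuse of notation, we can identify $\beta$ with the $\sigma$-algebra equal 
to the set of all unions of partition elements of $\beta$. Through this identification, $I(\alpha|\beta)$ and 
$H(\alpha|\beta)$ are well-defined. 

Definition 8. Let $T : X \to X$ be a measure-preserving transformation. If $\mathcal{F} \subset \mathcal{B}$ is a 
$T$-invariant sub-$\sigma$-algebra then the entropy rate of $\alpha$ conditioned on $\mathcal{F}$ is 

\[
h(T, \alpha|\mathcal{F}) := \lim_{n \to \infty} \frac{1}{n+1} H\Big(\bigvee_{i=0}^{n} T^{-i}\alpha \,\Big|\, \mathcal{F}\Big) = \lim_{n \to \infty} H\Big(T^{-n-1}\alpha \,\Big|\, \mathcal{F} \vee \bigvee_{i=0}^{n} T^{-i}\alpha\Big).
\]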

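Lemma 3.1. For any two partitions $\alpha, \beta$ and for any two $\sigma$-algebras $\mathcal{F}_1, \mathcal{F}_2$ with $\mathcal{F}_1 \subset \mathcal{F}_2$, 

\[
H(\alpha \vee \beta) = H(\alpha) + H(\beta|\alpha),
\]
\[
H(\alpha|\mathcal{F}_2) \le H(\alpha|\mathcal{F}_1)
\]

with equality if and only if $\mu(A|\mathcal{F}_2) = \mu(A|\mathcal{F}_1)$ a.e. for every $A \in \alpha$. In particular $H(\alpha|\beta) \le H(\alpha)$ 
and equality occurs iff $\alpha$ and $\beta$ are independent (i.e., for all $A \in \alpha$, $B \in \beta$, 
$\mu(A \cap B) = \mu(A)\mu(B)$). 

Proof. This is well-known. For example, see [Gl03, Proposition 14.16, page 255]. □ 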
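
For finite partitions the identities of the lemma can be checked numerically. The following sketch uses a hypothetical joint distribution `joint[i][j]`, standing for $\mu(A_i \cap B_j)$, and computes $H(\beta|\alpha)$ directly as an average of entropies of conditional distributions, so that the chain rule is verified rather than assumed.

```python
import math

def shannon(p):
    """Shannon entropy of a probability vector, skipping zero entries."""
    return -sum(q * math.log(q) for q in p if q > 0)

# Hypothetical joint distribution: joint[i][j] = mu(A_i ∩ B_j).
joint = [[0.10, 0.20, 0.05],
         [0.25, 0.10, 0.30]]

p_alpha = [sum(row) for row in joint]          # marginal of alpha
p_beta = [sum(col) for col in zip(*joint)]     # marginal of beta
H_join = shannon([q for row in joint for q in row])

# H(beta | alpha) as the mu-average of the entropy of beta restricted to each atom of alpha.
H_beta_given_alpha = sum(
    pa * shannon([q / pa for q in row])
    for pa, row in zip(p_alpha, joint) if pa > 0
)

# H(alpha | beta) obtained from the chain rule applied in the other order.
H_alpha_given_beta = H_join - shannon(p_beta)
```

Here `H_join` agrees with `shannon(p_alpha) + H_beta_given_alpha`, and `H_alpha_given_beta` is strictly smaller than `shannon(p_alpha)` because the chosen joint distribution is not a product.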

4 The space of partitions 

Let $G \curvearrowright^T (X, \mathcal{B}, \mu)$. Let $\mathcal{P}$ be the set of all partitions $\alpha$ of $X$ such that $H(\alpha) < \infty$. We 
identify partitions if they agree up to measure zero. The main theorem below is needed to 
prove that the $f$-invariant is a measure-conjugacy invariant (which is concluded in §5). The 
splittings concept introduced below will be useful in our study of Markov processes. 

Definition 9 (Rohlin distance). Define $d : \mathcal{P} \times \mathcal{P} \to \mathbb{R}$ by 

\[
d(\alpha, \beta) = H(\alpha|\beta) + H(\beta|\alpha) = 2H(\alpha \vee \beta) - H(\alpha) - H(\beta).
\]

By [Pa69, theorem 5.22, page 62] this defines a distance function on $\mathcal{P}$. The action of $G$ on 
$\mathcal{P}$ is isometric. I.e., if $g \in G$ and $\alpha, \beta \in \mathcal{P}$ then $d(T_g^{-1}\alpha, T_g^{-1}\beta) = d(\alpha, \beta)$. 
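
A minimal sketch of the Rohlin distance for partitions of a finite set with uniform measure may help fix ideas. Partitions are encoded as lists of labels, one label per point; this encoding and the function names are illustrative assumptions, not notation from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of the partition of a uniform finite space given by point labels."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())

def join(a, b):
    """Common refinement: two points share an atom iff they agree in both labelings."""
    return list(zip(a, b))

def rohlin(a, b):
    """d(a, b) = 2 H(a v b) - H(a) - H(b)."""
    return 2 * entropy(join(a, b)) - entropy(a) - entropy(b)

# Three hypothetical partitions of a six-point space.
alpha = [0, 0, 1, 1, 2, 2]
beta = [0, 1, 0, 1, 0, 1]
gamma = [0, 0, 0, 1, 1, 1]

d_ab = rohlin(alpha, beta)
d_bc = rohlin(beta, gamma)
d_ac = rohlin(alpha, gamma)
```

Since `alpha` and `beta` are independent here, `d_ab` equals $H(\alpha) + H(\beta) = \log 6$, the largest value the distance can take for partitions with these entropies.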

Definition 10. Let $\alpha$ and $\beta$ be partitions. If, for every atom $A \in \alpha$ there exists an atom 
$B \in \beta$ such that $\mu(A - B) = 0$ (i.e., $A \subset B$ up to a measure zero set) then we say $\alpha$ refines 
$\beta$. Equivalently, $\beta$ is a coarsening of $\alpha$. This is denoted by $\beta \le \alpha$. 

Definition 11. If $\alpha$ is a partition of $X$ and $Q \subset G$ is finite then let $\alpha^Q = \bigvee_{q \in Q} T_q^{-1}\alpha$. Two 
partitions $\alpha, \beta \in \mathcal{P}$ are equivalent if there exist finite sets $Q, P \subset G$ such that $\alpha \le \beta^P$ 
and $\beta \le \alpha^Q$. 

Theorem 4.1. If $\alpha, \beta \in \mathcal{P}$ are generating partitions and $\epsilon > 0$ then there exists a $\gamma \in \mathcal{P}$ 
that is equivalent to $\alpha$ such that $d(\gamma, \beta) < \epsilon$. In other words, the equivalence class of $\alpha$ is 
dense in the space of all generating partitions. 

For a proof, we refer the reader to [Bo08b]. The notation there differs from the notation 
here in one respect: $\alpha^Q$ is defined to be $\bigvee_{q \in Q} T_q \alpha$. Also, only groups, rather than semigroups, 
are treated in [Bo08b]. However, the proof requires only minor obvious changes to extend it 
to the semigroup case. 

4.1 Splittings 

Let us assume now (and for the rest of the paper) that $G = \langle s_1, \ldots, s_r \rangle$ is a free group or 
free semigroup with 1. If $G$ is a group then let $S = \{s_1, \ldots, s_r, s_1^{-1}, \ldots, s_r^{-1}\}$. If $G$ is only a 
semigroup, let $S = \{s_1, \ldots, s_r\}$. Let $G \curvearrowright^T (X, \mathcal{B}, \mu)$. 

Definition 12. Let $\alpha$ be a partition. A simple splitting of $\alpha$ is a partition $\sigma$ of the form 
$\sigma = \alpha \vee T_s^{-1}\beta$ where $s \in S$ and $\beta$ is a coarsening of $\alpha$. 

A splitting of $\alpha$ is any partition $\sigma$ that can be obtained from $\alpha$ by a sequence of simple 
splittings. In other words, there exist partitions $\alpha_0, \alpha_1, \ldots, \alpha_m$ such that $\alpha_0 = \alpha$, $\alpha_m = \sigma$ 
and $\alpha_{i+1}$ is a simple splitting of $\alpha_i$ for all $0 \le i < m$. 

Remark 1. In [Bo08b], an $S$-splitting of $\alpha$ is defined to be a partition $\sigma$ of the form $\sigma = \alpha \vee T_s \beta$ 
for $s \in S$. The definition given above is necessary to accommodate the case when $G$ is merely 
a semigroup. 

Definition 13. The right-Cayley graph $\Gamma$ of $(G, S)$ is defined as follows. The vertex set 
of $\Gamma$ is $G$. For every $s \in S$ and every $g \in G$ there is a directed edge from $g$ to $gs$ labeled $s$. 
There are no other edges. 

The induced right-subgraph of a subset $F \subset G$ is the largest subgraph of $\Gamma$ with 
vertex set $F$. A subset $F \subset G$ is right-connected if its induced right-subgraph in $\Gamma$ is 
connected. 

Lemma 4.2. If $\alpha, \beta \in \mathcal{P}$, $\alpha$ refines $\beta$ and $F \subset G$ is finite, right-connected and contains the 
identity element $e$ then 

\[
\alpha \vee \bigvee_{f \in F} T_f^{-1}\beta
\]

is a splitting of $\alpha$. 

Proof. We prove this by induction on $|F|$. If $|F| = 1$ then $F = \{e\}$ and the statement is 
trivial. Let $f_0 \in F - \{e\}$ be such that $F_1 = F - \{f_0\}$ is right-connected. To see that such 
an $f_0$ exists, choose a spanning tree for the induced right-subgraph of $F$. Let $f_0$ be any leaf 
of this tree that is not equal to $e$. 

By induction, $\alpha_1 := \alpha \vee \bigvee_{f \in F_1} T_f^{-1}\beta$ is a splitting of $\alpha$. Since $F$ is right-connected, there 
exists an element $f_1 \in F_1$ and an element $s_1 \in S$ such that $f_1 s_1 = f_0$. Since $f_1 \in F_1$, $\alpha_1$ 
refines $T_{f_1}^{-1}\beta$. Thus 

\[
\alpha \vee \bigvee_{f \in F} T_f^{-1}\beta = \alpha_1 \vee T_{f_0}^{-1}\beta = \alpha_1 \vee T_{s_1}^{-1}(T_{f_1}^{-1}\beta)
\]

is a splitting of $\alpha$. □ 
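
The induction in this proof amounts to listing the elements of $F$ in an order in which each new element $f_0$ is obtained from an earlier element $f_1$ by right multiplication with a generator $s_1$. The sketch below produces such an order by breadth-first search from the identity, in the group case only. The encoding of free group elements as reduced words (tuples of nonzero integers, with `i` for $s_i$ and `-i` for $s_i^{-1}$) and the function names are assumptions made for illustration.

```python
from collections import deque

def right_mul(word, s):
    """Multiply a reduced word on the right by the generator s and reduce."""
    if word and word[-1] == -s:
        return word[:-1]
    return word + (s,)

def splitting_order(F, gens):
    """Return triples (f0, f1, s1) with f0 = f1 * s1, in an order such that every f1
    is the identity or an earlier f0. Raises ValueError if F is not right-connected."""
    F = set(F)
    e = ()
    if e not in F:
        raise ValueError("F must contain the identity")
    order = []
    seen = {e}
    queue = deque([e])
    while queue:
        f1 = queue.popleft()
        for s in gens:
            f0 = right_mul(f1, s)
            if f0 in F and f0 not in seen:
                seen.add(f0)
                order.append((f0, f1, s))
                queue.append(f0)
    if seen != F:
        raise ValueError("F is not right-connected")
    return order

# Rank 2 free group: generators s_1, s_2 and their inverses.
gens = [1, -1, 2, -2]
F = {(), (1,), (1, 2), (-2,), (1, 2, 2)}
order = splitting_order(F, gens)
```

Each triple `(f0, f1, s1)` corresponds to one simple splitting $\alpha_1 \vee T_{s_1}^{-1}(T_{f_1}^{-1}\beta)$ in the proof; the last element found by the search is a leaf of the search tree, which is the element $f_0$ removed first in the induction.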

To ease notation, let $\alpha^n = \alpha^{B(e,n)}$ where $B(e,n)$ denotes the ball of radius $n$ centered at 
the identity element in $G$ with respect to the word metric induced by $S$. 

Proposition 4.3. Let $\alpha, \beta \in \mathcal{P}$. Suppose there are $n, m \in \mathbb{N}$ such that $\alpha \le \beta^n \le \alpha^m$. Then 
$\alpha^m$ is a splitting of $\beta$. 

Proof. By the previous lemma, $\beta^n \vee \alpha^m = \alpha^m$ is a splitting of $\beta$. □ 



5 An alternative formula for the $f$-invariant 

Recall the definitions of $F$ and $F_*$ from the introduction. We will write $F(\alpha|\mathcal{F})$ for $F(T, \alpha|\mathcal{F})$ 
when $T$ is clear. Similar statements apply to $f(\alpha|\mathcal{F})$, $F_*(\alpha|\mathcal{F})$, etc. 

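Proposition 5.1. Let $G \curvearrowright^T (X, \mathcal{B}, \mu)$. If $\mathcal{F} \subset \mathcal{B}$ is any $T(G)$-invariant $\sigma$-algebra, $\alpha$ is any 
partition with $H(\alpha) < \infty$, and $\sigma$ is any splitting (definition 12) of $\alpha$ then $F(\sigma|\mathcal{F}) \le F(\alpha|\mathcal{F})$ 
and $F_*(\sigma|\mathcal{F}) \le F_*(\alpha|\mathcal{F})$. 

Proof. It suffices to consider the case in which $\sigma$ is a simple splitting. So, there exist $t \in S$ 
and a coarsening $\beta$ of $\alpha$ such that $\sigma = \alpha \vee T_t^{-1}\beta$. We will assume that $t \in \{s_1, \ldots, s_r\}$. The 
proof in the case that $t \in \{s_1^{-1}, \ldots, s_r^{-1}\}$ is similar. Using lemma 3.1 it follows that 

\[
F(\sigma|\mathcal{F}) = F(\alpha|\mathcal{F}) + (1 - 2r) H(\sigma|\alpha \vee \mathcal{F}) + \sum_{i=1}^{r} H(\sigma \vee T_{s_i}^{-1}\sigma \mid \alpha \vee T_{s_i}^{-1}\alpha \vee \mathcal{F}).
\]

Note that 

\[
H(\sigma \vee T_{s_i}^{-1}\sigma \mid \alpha \vee T_{s_i}^{-1}\alpha \vee \mathcal{F}) \le H(\sigma \mid \alpha \vee T_{s_i}^{-1}\alpha \vee \mathcal{F}) + H(T_{s_i}^{-1}\sigma \mid \alpha \vee T_{s_i}^{-1}\alpha \vee \mathcal{F}).
\]

Since $\sigma$ is refined by $\alpha \vee T_t^{-1}\alpha$ it follows that $H(\sigma \mid \alpha \vee T_t^{-1}\alpha \vee \mathcal{F}) = 0$. Without loss of 
generality, $t = s_r$. Thus, 

\[
\begin{aligned}
F(\sigma|\mathcal{F}) \le F(\alpha|\mathcal{F}) &+ \sum_{i=1}^{r-1} \Big( H(\sigma \mid \alpha \vee T_{s_i}^{-1}\alpha \vee \mathcal{F}) + H(T_{s_i}^{-1}\sigma \mid \alpha \vee T_{s_i}^{-1}\alpha \vee \mathcal{F}) - 2H(\sigma \mid \alpha \vee \mathcal{F}) \Big) \\
&+ H(T_{s_r}^{-1}\sigma \mid \alpha \vee T_{s_r}^{-1}\alpha \vee \mathcal{F}) - H(\sigma \mid \alpha \vee \mathcal{F}).
\end{aligned}
\]

Since $H(\sigma \mid \alpha \vee T_{s_i}^{-1}\alpha \vee \mathcal{F}) \le H(\sigma \mid \alpha \vee \mathcal{F})$ and 

\[
H(T_{s_i}^{-1}\sigma \mid \alpha \vee T_{s_i}^{-1}\alpha \vee \mathcal{F}) \le H(T_{s_i}^{-1}\sigma \mid T_{s_i}^{-1}\alpha \vee \mathcal{F}) = H(\sigma \mid \alpha \vee \mathcal{F})
\]

it follows that $F(\sigma|\mathcal{F}) \le F(\alpha|\mathcal{F})$ as claimed. 

The proof in the case of $F_*$ is similar. By a well-known relative version of theorem 1.1, 
if $s \in S$ then 

\[
h(T_s, \sigma|\mathcal{F}) = h(T_s, \alpha|\mathcal{F}) + h(T_s, \sigma \mid \alpha^s \vee \mathcal{F})
\]

where $\alpha^s$ is the smallest $T_s$-invariant $\sigma$-algebra containing $\alpha$. As above, assume $t = s_r$. Thus, 

\[
\begin{aligned}
F_*(\sigma|\mathcal{F}) &= F_*(\alpha|\mathcal{F}) + (1 - r) H(\sigma \mid \alpha \vee \mathcal{F}) + \sum_{i=1}^{r} h(T_{s_i}, \sigma \mid \alpha^{s_i} \vee \mathcal{F}) \\
&= F_*(\alpha|\mathcal{F}) + (1 - r) H(\sigma \mid \alpha \vee \mathcal{F}) + \sum_{i=1}^{r-1} h(T_{s_i}, \sigma \mid \alpha^{s_i} \vee \mathcal{F}) \\
&= F_*(\alpha|\mathcal{F}) + \sum_{i=1}^{r-1} \Big( h(T_{s_i}, \sigma \mid \alpha^{s_i} \vee \mathcal{F}) - H(\sigma \mid \alpha \vee \mathcal{F}) \Big).
\end{aligned}
\]

The second equality occurs because $\alpha \vee T_t^{-1}\alpha$ refines $\sigma$, which implies $h(T_{s_r}, \sigma \mid \alpha^{s_r}) = 0$. Since 
$h(T_s, \sigma \mid \alpha^s \vee \mathcal{F}) \le H(\sigma \mid \alpha \vee \mathcal{F})$ for each $s \in S$, the above equality implies the proposition. □ 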

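Definition 14. If $\mathcal{F} \subset \mathcal{B}$ is any $T(G)$-invariant $\sigma$-algebra then define 

\[
f(\alpha|\mathcal{F}) := \lim_{n \to \infty} F(\alpha^n|\mathcal{F}) = \inf_n F(\alpha^n|\mathcal{F}),
\]

\[
f_*(\alpha|\mathcal{F}) := \lim_{n \to \infty} F_*(\alpha^n|\mathcal{F}) = \inf_n F_*(\alpha^n|\mathcal{F}).
\]

The previous proposition and proposition 4.3 imply that this is well-defined. When we 
need to emphasize the dependence on $\mu$ and/or $T$ we will write $f(\mu, \alpha|\mathcal{F})$ or $f(T, \alpha|\mathcal{F})$ for 
$f(\alpha|\mathcal{F})$ and similarly for $F$, $F_*$, $f_*$. 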

Next we investigate the continuity properties of these functions. 

Proposition 5.2. Let $G \curvearrowright^T (X, \mathcal{B}, \mu)$. Let $\mathcal{P}$ be the space of partitions $\alpha$ of $X$ with 
$H(\alpha) < \infty$. Endow $\mathcal{P}$ with the topology induced by the Rohlin distance (definition 9). Then 
$F$ and $F_*$ are continuous on $\mathcal{P}$ and $f$ and $f_*$ are upper semi-continuous on $\mathcal{P}$. 

Proof. It is immediate that $F$ is continuous. Y. Sinai proved that for every $s \in S$, the 
function $\alpha \mapsto h(T_s, \alpha)$ is continuous on $\mathcal{P}$ (see for example [Gl03]). From this it follows that 
$F_*$ is continuous. The function $\alpha \mapsto \alpha^n$ is continuous by lemma 4.2 of [Bo08b] (it is also an 
easy exercise). Thus, each of $f$ and $f_*$ is an infimum of a sequence of continuous functions. 
This implies that $f$ and $f_*$ are upper semi-continuous. □ 

Later (in lemma 9.5) we investigate the continuity properties of the above functions in 
the variable $\mu$ rather than $\alpha$. 

Theorem 5.3. Let $G \curvearrowright^T (X, \mathcal{B}, \mu)$. If $\alpha$ and $\beta$ are any two generating partitions with 
$H(\alpha) + H(\beta) < \infty$ and $\mathcal{F} \subset \mathcal{B}$ is any $T(G)$-invariant $\sigma$-algebra then $f(\alpha|\mathcal{F}) = f(\beta|\mathcal{F})$ and 
$f_*(\alpha|\mathcal{F}) = f_*(\beta|\mathcal{F})$. Thus, we can define $f(T|\mathcal{F}) = f(\alpha|\mathcal{F})$ and $f_*(T|\mathcal{F}) = f_*(\alpha|\mathcal{F})$ for any 
finite-entropy generating partition $\alpha$. 

Proof. This follows from theorem 4.1 and propositions 4.3, 5.1 and 5.2. To see this, note that 
from theorem 4.1 there exists a sequence of partitions $\alpha_n$ with $d(\alpha_n, \beta) \to 0$ as $n \to \infty$ and 
integers $m(n), p(n)$ with $\alpha_n \le \alpha^{m(n)} \le \alpha_n^{p(n)}$. Thus $\alpha \le \alpha_n^{p(n)} \le \alpha^{m(n)+p(n)}$. From proposition 
4.3 this implies that $\alpha^{m(n)+p(n)}$ is a splitting of $\alpha_n$. Thus for every $k$, $\alpha^{m(n)+p(n)+k}$ is a splitting 
of $\alpha_n^k$. By proposition 5.1 this implies $F(\alpha^{m(n)+p(n)+k}) \le F(\alpha_n^k)$. The definition of $f$ now 
implies that $f(\alpha) \le f(\alpha_n)$ for all $n$. By the previous proposition, $f$ is upper semi-continuous. 
Since $\alpha_n$ converges to $\beta$, this implies $f(\alpha) \le f(\beta)$. By reversing the roles of $\alpha$ and $\beta$ we 
obtain the reverse inequality. Hence $f(\alpha) = f(\beta)$ as claimed. The conditional case and the 
case of $f_*$ in place of $f$ are similar. 

□ 

In section 9 it is proven that $f(T|\mathcal{F}) = f_*(T|\mathcal{F})$. The proof uses Markov processes, which 
are studied next. 

6 Markov Processes 

Definition 15. A $G$-process is a quadruple $(T, X, \mu, \alpha)$ where $G \curvearrowright^T (X, \mathcal{B}, \mu)$ and $\alpha$ is a 
partition of $X$. 

Definition 16. Two processes $(T, X, \mu, \alpha)$ and $(U, Y, \nu, \beta)$ are isomorphic if there exist 
conull sets $X' \subset X$, $Y' \subset Y$ and a measurable map $\phi : X' \to Y'$ with measurable inverse 
$\phi^{-1} : Y' \to X'$ such that $\phi_*\mu = \nu$, $\phi(T_g x) = U_g \phi(x)$ for all $g \in G$, $x \in X'$, and $\phi_*\alpha = \beta$ (i.e., 
$\phi$ induces a bijection from $\alpha$ to $\beta$). 

Definition 17. The left-Cayley graph $\Gamma_L$ of $(G, S)$ is defined as follows. Its vertex set is 
$G$ and for every $g \in G$ and $s \in S$ there is a directed edge from $g$ to $sg$. There are no other 
edges. If $F \subset G$ then the left-subgraph induced by $F$ is the subgraph of $\Gamma_L$ that has 
vertex set equal to $F$ and contains every edge in $\Gamma_L$ whose endpoints are in $F$. A set $F$ is 
left-connected if the left-subgraph induced by $F$ is connected. 

This is opposite the right-Cayley graph which was defined earlier (definition 13). 

Definition 18. For all $g_1, g_2 \in G$ let $\mathrm{Past}(g_1; g_2) \subset G$ be the set of all $f \in G$ such that every 
path in the left-Cayley graph $\Gamma_L$ from $f$ to $g_1$ passes through $g_2$. 

Definition 19. For any measure $\mu$ on $X$, any Borel set $A \subset X$ and any $\sigma$-algebra $\mathcal{F}$, let 
$\mu(A|\mathcal{F}) : X \to \mathbb{R}$ denote the conditional expectation of the characteristic function of $A$ 
with respect to $\mathcal{F}$. 

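Definition 20. A process $(T, X, \mu, \alpha)$ is a Markov process if for every $s \in S$, $g \in G$ and 
every $A \in \alpha$ 

\[
\mu\Big(T_{sg}^{-1}A \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big)(x) = \mu\big(T_{sg}^{-1}A \mid T_g^{-1}\alpha\big)(x) = \mu\big(T_s^{-1}A \mid \alpha\big)(T_g x)
\]

for a.e. $x \in X$. The second equality above is automatically true since $T_g$ preserves $\mu$. By 
lemma 3.1 this is equivalent to: 

\[
H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\big(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha\big) = H\big(T_s^{-1}\alpha \mid \alpha\big)
\]

for every $g \in G$. 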

The main result of this section is: 

Theorem 6.1. If $(T, X, \mu, \alpha)$ is a Markov process and $\beta$ is a coarsening of $\alpha$ then 
$f(T, \alpha|\beta^G) = F(T, \alpha|\beta^G) = F_*(T, \alpha|\beta^G) = f_*(T, \alpha|\beta^G)$ where $\beta^G$ is the smallest 
$T(G)$-invariant $\sigma$-algebra containing $\beta$. 

In section 11 we prove a converse: if $F(T, \alpha) = f(T, \alpha)$ then $(T, X, \mu, \alpha)$ is Markov. In 
order to prove the above, we will need some lemmas. 

Lemma 6.2. If $\beta \le \alpha$ are partitions, $\mathcal{F}_1 \subset \mathcal{F}_2$ are $\sigma$-algebras and $H(\alpha|\mathcal{F}_1) = H(\alpha|\mathcal{F}_2)$, 
then $H(\beta|\mathcal{F}_1) = H(\beta|\mathcal{F}_2)$. 

Proof. This follows from the fact that conditional expectation is additive. □ 

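Lemma 6.3. If $(T, X, \mu, \alpha)$ is a Markov process and $\sigma$ is a splitting of $\alpha$ then $(T, X, \mu, \sigma)$ 
is a Markov process. 

Proof. By induction, we may assume that $\sigma$ is a simple splitting of $\alpha$. So there exist a $t \in S$ 
and a coarsening $\beta$ of $\alpha$ such that $\sigma = \alpha \vee T_t^{-1}\beta$. It suffices to prove that 

\[
H\Big(T_{sg}^{-1}\sigma \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\sigma\Big) = H\big(T_{sg}^{-1}\sigma \mid T_g^{-1}\sigma\big)
\]

for every $s \in S$ and $g \in G$. 

Case 1. Assume $s \ne t$. Then $f \in \mathrm{Past}(sg; g)$ implies $tf \in \mathrm{Past}(sg; g)$. So, 
$\bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\sigma = \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha$. Thus 

\[
\begin{aligned}
H\Big(T_{sg}^{-1}\sigma \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\sigma\Big) &= H\Big(T_{sg}^{-1}\sigma \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) && (1) \\
&= H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) && (2) \\
&\quad + H\Big(T_{sg}^{-1}T_t^{-1}\beta \,\Big|\, T_{sg}^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big). && (3)
\end{aligned}
\]

Recall that $T_{sg}^{-1}T_t^{-1} = T_{tsg}^{-1}$. For the first summand, note that since $tg \in \mathrm{Past}(sg; g)$ and $\beta \le \alpha$, 

\[
\begin{aligned}
H\big(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha\big) &\ge H\big(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha \vee T_{tg}^{-1}\beta\big) \\
&\ge H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\big(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha\big).
\end{aligned}
\]

Hence 

\[
H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\big(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha \vee T_{tg}^{-1}\beta\big). \qquad (4)
\]

For the second summand above, note that $\{g, tg\} \subset \mathrm{Past}(sg; g)$ and $\{sg\} \cup \mathrm{Past}(sg; g) \subset 
\mathrm{Past}(tsg; sg)$. Hence, 

\[
\begin{aligned}
H\big(T_{tsg}^{-1}\alpha \mid T_{sg}^{-1}\alpha\big) &\ge H\big(T_{tsg}^{-1}\alpha \mid T_{sg}^{-1}\alpha \vee T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big) \\
&\ge H\Big(T_{tsg}^{-1}\alpha \,\Big|\, T_{sg}^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) \\
&\ge H\Big(T_{tsg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(tsg;sg)} T_f^{-1}\alpha\Big) = H\big(T_{tsg}^{-1}\alpha \mid T_{sg}^{-1}\alpha\big).
\end{aligned}
\]

Hence 

\[
H\Big(T_{tsg}^{-1}\alpha \,\Big|\, T_{sg}^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\big(T_{tsg}^{-1}\alpha \mid T_{sg}^{-1}\alpha \vee T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big).
\]

By the previous lemma this implies 

\[
H\Big(T_{tsg}^{-1}\beta \,\Big|\, T_{sg}^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\big(T_{tsg}^{-1}\beta \mid T_{sg}^{-1}\alpha \vee T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big). \qquad (5)
\]

Equations (1), (3), (4) and (5) imply 

\[
\begin{aligned}
H\Big(T_{sg}^{-1}\sigma \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\sigma\Big) &= H\big(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha \vee T_{tg}^{-1}\beta\big) + H\big(T_{tsg}^{-1}\beta \mid T_{sg}^{-1}\alpha \vee T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big) \\
&= H\big(T_{sg}^{-1}\alpha \vee T_{tsg}^{-1}\beta \mid T_g^{-1}\alpha \vee T_{tg}^{-1}\beta\big) = H\big(T_{sg}^{-1}\sigma \mid T_g^{-1}\sigma\big).
\end{aligned}
\]

Case 2. Assume $s = t$. Then 

\[
\bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\sigma = \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha \vee T_f^{-1}T_t^{-1}\beta = T_{tg}^{-1}\beta \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha.
\]

Since $\beta \le \alpha$, 

\[
\begin{aligned}
H\Big(T_{sg}^{-1}\sigma \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\sigma\Big) &= H\Big(T_{sg}^{-1}\sigma \,\Big|\, T_{tg}^{-1}\beta \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) && (6) \\
&= H\Big(T_{sg}^{-1}\alpha \,\Big|\, T_{tg}^{-1}\beta \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) && (7) \\
&\quad + H\Big(T_{tsg}^{-1}\beta \,\Big|\, T_{sg}^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big). && (8)
\end{aligned}
\]

We claim that 

\[
H\Big(T_{tsg}^{-1}\beta \,\Big|\, T_{sg}^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\big(T_{tsg}^{-1}\beta \mid T_{sg}^{-1}\alpha \vee T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big). \qquad (9)
\]

This follows from the same argument used to prove equation (5) except in one detail: $\{g, tg\} \not\subset 
\mathrm{Past}(sg; g)$ this time. However, since $s = t$ and $\beta \le \alpha$, it is still true that 

\[
H\big(T_{tsg}^{-1}\alpha \mid T_{sg}^{-1}\alpha \vee T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big) \ge H\Big(T_{tsg}^{-1}\alpha \,\Big|\, T_{sg}^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big).
\]

The rest of the proof of equation (9) is the same as the proof of equation (5). Next, 

\[
\begin{aligned}
H\Big(T_{sg}^{-1}\alpha \,\Big|\, T_{tg}^{-1}\beta \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) &= H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) && (10) \\
&\quad - H\Big(T_{tg}^{-1}\beta \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) && (11) \\
&= H\big(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha\big) - H\big(T_{tg}^{-1}\beta \mid T_g^{-1}\alpha\big) && (12) \\
&= H\big(T_{sg}^{-1}\alpha \mid T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big). && (13)
\end{aligned}
\]

The second equality above follows from the previous lemma and the fact that 

\[
H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\big(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha\big).
\]

The third equality uses that $s = t$ so $T_{sg}^{-1}\alpha \ge T_{tg}^{-1}\beta$. Equations (6), (8), (9) and (13) imply 

\[
\begin{aligned}
H\Big(T_{sg}^{-1}\sigma \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\sigma\Big) &= H\big(T_{sg}^{-1}\alpha \mid T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big) + H\big(T_{tsg}^{-1}\beta \mid T_{sg}^{-1}\alpha \vee T_{tg}^{-1}\beta \vee T_g^{-1}\alpha\big) \\
&= H\big(T_{sg}^{-1}\alpha \vee T_{tsg}^{-1}\beta \mid T_g^{-1}\alpha \vee T_{tg}^{-1}\beta\big) = H\big(T_{sg}^{-1}\sigma \mid T_g^{-1}\sigma\big).
\end{aligned}
\]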

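□

Lemma 6.4. Let $(T, X, \mu, \alpha)$ be a Markov process. Let $\beta \le \alpha$. Let $\beta^G$ be the smallest
$G$-invariant $\sigma$-algebra containing the atoms of $\beta$. Then for every $s \in S$ and $g \in G$,

\[ H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha \vee \beta^G\Big) = H(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha \vee \beta^G). \]

Proof. Since

\[ H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha \vee \bigvee_{f \notin \mathrm{Past}(sg;g)} T_f^{-1}\beta\Big) = H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha \vee \beta^G\Big) \]
\[ \le H(T_{sg}^{-1}\alpha \mid T_g^{-1}\alpha \vee \beta^G) \]
\[ \le H\Big(T_{sg}^{-1}\alpha \,\Big|\, T_g^{-1}\alpha \vee \bigvee_{f \notin \mathrm{Past}(sg;g)} T_f^{-1}\beta\Big), \]

it suffices to show that

\[ H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha \vee \bigvee_{f \notin \mathrm{Past}(sg;g)} T_f^{-1}\beta\Big) = H\Big(T_{sg}^{-1}\alpha \,\Big|\, T_g^{-1}\alpha \vee \bigvee_{f \notin \mathrm{Past}(sg;g)} T_f^{-1}\beta\Big). \]

To prove this, it suffices to show that for every left-connected finite set
$F \subset G - \mathrm{Past}(sg;g)$ with $sg \in F$,

\[ H\Big(T_{sg}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha \vee \bigvee_{f \in F} T_f^{-1}\beta\Big) = H\Big(T_{sg}^{-1}\alpha \,\Big|\, T_g^{-1}\alpha \vee \bigvee_{f \in F} T_f^{-1}\beta\Big). \]

Equivalently,

\[ H\Big(T_{sg}^{-1}\alpha \vee \bigvee_{f \in F} T_f^{-1}\beta \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) - H\Big(\bigvee_{f \in F} T_f^{-1}\beta \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) \]
\[ = H\Big(T_{sg}^{-1}\alpha \vee \bigvee_{f \in F} T_f^{-1}\beta \,\Big|\, T_g^{-1}\alpha\Big) - H\Big(\bigvee_{f \in F} T_f^{-1}\beta \,\Big|\, T_g^{-1}\alpha\Big). \]

Thus, it suffices to prove the following two statements:

\[ H\Big(T_{sg}^{-1}\alpha \vee \bigvee_{f \in F} T_f^{-1}\beta \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\Big(T_{sg}^{-1}\alpha \vee \bigvee_{f \in F} T_f^{-1}\beta \,\Big|\, T_g^{-1}\alpha\Big), \]
\[ H\Big(\bigvee_{f \in F} T_f^{-1}\beta \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\Big(\bigvee_{f \in F} T_f^{-1}\beta \,\Big|\, T_g^{-1}\alpha\Big). \]

By lemma 6.2 it suffices to prove

\[ H\Big(\bigvee_{f \in F} T_f^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\Big(\bigvee_{f \in F} T_f^{-1}\alpha \,\Big|\, T_g^{-1}\alpha\Big). \tag{14} \]

We will prove this by induction on $|F|$. If $|F| = 1$, then this follows immediately from the
definition of Markov processes. If $|F| > 1$ then there exists $f_0 \in F$ and $t \in S$ such that
$tf_0 \in F$, $tf_0 \ne sg$ and $F' := F - \{tf_0\}$ is left-connected. So,

\[ H\Big(\bigvee_{f \in F} T_f^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\Big(\bigvee_{f \in F'} T_f^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) \tag{15} \]
\[ + H\Big(T_{tf_0}^{-1}\alpha \,\Big|\, \bigvee_{f \in F'} T_f^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big). \tag{16} \]

By induction,

\[ H\Big(\bigvee_{f \in F'} T_f^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\Big(\bigvee_{f \in F'} T_f^{-1}\alpha \,\Big|\, T_g^{-1}\alpha\Big). \tag{17} \]

Since $F' \cup \mathrm{Past}(sg;g) \subset \mathrm{Past}(tf_0;f_0)$,

\[ H\Big(T_{tf_0}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(tf_0;f_0)} T_f^{-1}\alpha\Big) \le H\Big(T_{tf_0}^{-1}\alpha \,\Big|\, \bigvee_{f \in F'} T_f^{-1}\alpha \vee \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) \tag{18} \]
\[ \le H\Big(T_{tf_0}^{-1}\alpha \,\Big|\, \bigvee_{f \in F' \cup \{g\}} T_f^{-1}\alpha\Big) \tag{19} \]
\[ \le H(T_{tf_0}^{-1}\alpha \mid T_{f_0}^{-1}\alpha) \tag{20} \]
\[ = H\Big(T_{tf_0}^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(tf_0;f_0)} T_f^{-1}\alpha\Big). \tag{21} \]

So equality holds throughout. Equations (15), (17) and (21) imply

\[ H\Big(\bigvee_{f \in F} T_f^{-1}\alpha \,\Big|\, \bigvee_{f \in \mathrm{Past}(sg;g)} T_f^{-1}\alpha\Big) = H\Big(\bigvee_{f \in F'} T_f^{-1}\alpha \,\Big|\, T_g^{-1}\alpha\Big) + H\Big(T_{tf_0}^{-1}\alpha \,\Big|\, \bigvee_{f \in F' \cup \{g\}} T_f^{-1}\alpha\Big) \]
\[ = H\Big(\bigvee_{f \in F} T_f^{-1}\alpha \,\Big|\, T_g^{-1}\alpha\Big). \]

This proves equation (14) and hence finishes the lemma. □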

To simplify notation, we write $F(\alpha|\mathcal{F})$ for $F(T, \alpha|\mathcal{F})$ when $T$ is clear. Similar statements
apply to $f(\alpha|\mathcal{F})$, $F_*(\alpha|\mathcal{F})$, etc.

Lemma 6.5. Let $(T, X, \mu, \alpha)$ be a Markov process. Let $\beta$ be a coarsening of $\alpha$. Then for
any splitting $\sigma$ of $\alpha$, $F(\sigma|\beta^G) = F(\alpha|\beta^G)$.

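Proof. By lemma 6.3 it suffices to consider the special case in which $\sigma$ is a simple splitting
of $\alpha$. So there is a $t \in S$ such that $\alpha \le \sigma \le \alpha \vee T_t^{-1}\alpha$. Since $F$ is monotone decreasing under splittings (proposition 5.1), $F(\alpha|\beta^G) \ge
F(\sigma|\beta^G) \ge F(\alpha \vee T_t^{-1}\alpha|\beta^G)$. Hence it suffices to show that $F(\alpha|\beta^G) = F(\alpha \vee T_t^{-1}\alpha|\beta^G)$. If
$G$ is a group rather than a semigroup then by $G$-invariance,

\[ F(\alpha \vee T_t^{-1}\alpha|\beta^G) = F(\alpha \vee T_{t^{-1}}^{-1}\alpha|\beta^G). \]

So, without loss of generality, we may assume that $t = s_r \in S$.
We claim that

\[ F(\alpha \vee T_t^{-1}\alpha|\beta^G) = F(\alpha|\beta^G) + (1 - 2r)H(T_t^{-1}\alpha \mid \alpha \vee \beta^G) \tag{22} \]
\[ + \sum_{i=1}^{r} H(T_t^{-1}\alpha \vee T_{s_i}^{-1}T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_{s_i}^{-1}\alpha). \tag{23} \]

To see this, in the formula for $F(\alpha \vee T_t^{-1}\alpha|\beta^G)$, replace $H(\alpha \vee T_t^{-1}\alpha|\beta^G)$ with $H(\alpha|\beta^G) +
H(T_t^{-1}\alpha \mid \alpha \vee \beta^G)$ and for each $s \in \{s_1, \ldots, s_r\}$ replace

\[ H(\alpha \vee T_s^{-1}\alpha \vee T_t^{-1}\alpha \vee T_s^{-1}T_t^{-1}\alpha \mid \beta^G) \]

with

\[ H(\alpha \vee T_s^{-1}\alpha \mid \beta^G) + H(T_t^{-1}\alpha \vee T_s^{-1}T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_s^{-1}\alpha). \]

Collecting terms implies the claim.
Note that for any $s \in S$,

\[ H(T_t^{-1}\alpha \vee T_s^{-1}T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_s^{-1}\alpha) = H(T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_s^{-1}\alpha) \tag{24} \]
\[ + H(T_s^{-1}T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_s^{-1}\alpha \vee T_t^{-1}\alpha). \tag{25} \]

If $s = t = s_r$ then the above quantity equals $0 + H(T_t^{-1}T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_t^{-1}\alpha)$. By the previous
lemma, this equals $H(T_{t^2}^{-1}\alpha \mid \beta^G \vee T_t^{-1}\alpha) = H(T_t^{-1}\alpha \mid \beta^G \vee \alpha)$. Now substitute this into equation
(22) to obtain

\[ F(\alpha \vee T_t^{-1}\alpha|\beta^G) - F(\alpha|\beta^G) \tag{26} \]
\[ = \sum_{i=1}^{r-1} H(T_t^{-1}\alpha \vee T_{s_i}^{-1}T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_{s_i}^{-1}\alpha) - 2(r-1)H(T_t^{-1}\alpha \mid \alpha \vee \beta^G). \tag{27} \]

If $s \ne t$ then the previous lemma implies

\[ H(T_t^{-1}\alpha \vee T_s^{-1}T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_s^{-1}\alpha) = H(T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_s^{-1}\alpha) \]
\[ + H(T_s^{-1}T_t^{-1}\alpha \mid \beta^G \vee \alpha \vee T_s^{-1}\alpha \vee T_t^{-1}\alpha) \]
\[ = H(T_t^{-1}\alpha \mid \beta^G \vee \alpha) + H(T_s^{-1}T_t^{-1}\alpha \mid \beta^G \vee T_s^{-1}\alpha) \]
\[ = 2H(T_t^{-1}\alpha \mid \alpha \vee \beta^G). \]

Each of the $r - 1$ terms in the sum in (27) therefore equals $2H(T_t^{-1}\alpha \mid \alpha \vee \beta^G)$, so equation (26) now implies $F(\alpha \vee T_t^{-1}\alpha|\beta^G) = F(\alpha|\beta^G)$ as claimed. □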

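We can now prove theorem 6.1.

Proof of theorem 6.1. The first equality follows from the previous lemma and the definition of $f$.
To prove the second equality, first note that for any $s \in S$,

\[ h(T_s, \alpha|\beta^G) = \lim_{n \to \infty} H\Big(T_s^{-n-1}\alpha \,\Big|\, \beta^G \vee \bigvee_{i=0}^{n} T_s^{-i}\alpha\Big). \]

Hence,

\[ H(\alpha \vee T_s^{-1}\alpha|\beta^G) - H(\alpha|\beta^G) = H(T_s^{-1}\alpha \mid \alpha \vee \beta^G) \]
\[ = H(T_s^{-n-1}\alpha \mid T_s^{-n}\alpha \vee \beta^G) \]
\[ \ge H\Big(T_s^{-n-1}\alpha \,\Big|\, \beta^G \vee \bigvee_{i=0}^{n} T_s^{-i}\alpha\Big) \]
\[ \ge H\Big(T_s^{-n-1}\alpha \,\Big|\, \beta^G \vee \bigvee_{f \in \mathrm{Past}(s^{n+1};s^n)} T_f^{-1}\alpha\Big) \]
\[ = H(T_s^{-n-1}\alpha \mid T_s^{-n}\alpha \vee \beta^G) = H(T_s^{-1}\alpha \mid \alpha \vee \beta^G). \]

Thus, equality holds throughout. Hence

\[ h(T_s, \alpha|\beta^G) = H(\alpha \vee T_s^{-1}\alpha|\beta^G) - H(\alpha|\beta^G). \]

We now have

\[ F_*(\alpha|\beta^G) = (1 - r)H(\alpha|\beta^G) + \sum_{i=1}^{r} h(T_{s_i}, \alpha|\beta^G) \]
\[ = (1 - 2r)H(\alpha|\beta^G) + \sum_{i=1}^{r} H(\alpha \vee T_{s_i}^{-1}\alpha|\beta^G) = F(\alpha|\beta^G). \]

This proves the second equality in the statement. By proposition 4.3, $\alpha^n$ is a splitting of
$\alpha$. By lemma 6.3, $(T, X, \mu, \alpha^n)$ is a Markov process. Thus by the above, $F_*(\alpha^n|\beta^G) =
F(\alpha^n|\beta^G) = f(\alpha|\beta^G)$ for all $n \ge 0$. Take the infimum over all $n$ to see that $f_*(\alpha|\beta^G) =
F(\alpha|\beta^G)$. □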



7 Markov Chains 

The purpose of this section is to develop a constructive approach to Markov processes through
transition matrices and symbolic dynamics. This will be used later to prove $f = f_*$ in general.

7.1 The existence theorem 

Definition 21. An ordered process is a quadruple $(T, X, \mu, \alpha)$ where $(T, X, \mu)$ is a $G$-system
and $\alpha = (A_1, A_2, \ldots)$ is an ordered partition. Two ordered processes $(T, X, \mu, \alpha)$,
$(S, Y, \nu, \beta)$ are isomorphic (as ordered processes) if there is a measure-conjugacy $\phi : X \to Y$
that maps the $i$-th atom of $\alpha$ to the $i$-th atom of $\beta$ for all $i \ge 1$.

Definition 22. Let X = (T,X, fi,a) and Y" = (U,Y,u,/3) be two ordered processes with 
a = (A u A 2 , . . .) and (3 = (B u B 2 , ■■■)■ For n > let 

oo 

ses 1 i,i=i 
Here we are following the convention that if, for example, $\alpha = (A_1, \ldots, A_n)$ is finite then
$A_i := \emptyset$ for all $i > n$. $d_1$ is symmetric and satisfies the triangle inequality but it is not
a distance function since two nonisomorphic processes could be at distance zero from each
other.

The main result of this section is:

Theorem 7.1. Let $Y = (U, Y, \nu, \beta)$ be an ordered process. Then there exists a Markov
process $X = (T, X, \mu, \alpha)$ such that $d_1(X, Y) = 0$. Moreover, $X$ is unique up to isomorphism
(as an ordered process).

7.2 Symbolic dynamics notation 

If $K$ is any topological space then $K^G$ denotes the set of all functions $x : G \to K$. It can
also be thought of as the product space $K^G = \prod_{g \in G} K$ and hence is endowed with the
product topology. In most of the applications of this paper, $K$ is either finite or countably
infinite. In these cases, it is implicitly assumed that $K$ has the discrete topology and this
induces the product topology on $K^G$. The canonical action of $G$ on $K^G$ is defined by
$T_g x(f) = x(fg)$ for all $f, g \in G$ and $x \in K^G$. The canonical partition of $K^G$ is $\alpha = \{A_k \mid k \in K\}$
where $A_k = \{x \in K^G \mid x(e) = k\}$.

A measure $\mu$ on $K^G$ is invariant if $\mu(T_g^{-1}E) = \mu(E)$ for all Borel $E \subset K^G$ and $g \in G$. Let
$M(K^G)$ denote the space of all invariant Borel probability measures $\mu$ on $K^G$. The weak*
topology on $M(K^G)$ is defined as follows. We say that a sequence $\{\mu_n\}_{n=1}^{\infty} \subset M(K^G)$
converges to $\mu \in M(K^G)$ in the weak* topology if and only if for every continuous function
$f : K^G \to \mathbb{R}$, $\lim_{n \to \infty} \int f \, d\mu_n = \int f \, d\mu$. Equivalently, $\lim_{n \to \infty} \mu_n = \mu$ (weak*) if and only if
for every $m \ge 0$ and every $A \in \alpha^m$, $\lim_{n \to \infty} \mu_n(A) = \mu(A)$.

7.3 Transition Systems 

Definition 23. Let $K$ be a finite or countably infinite set. A stochastic matrix $P$ with
state space $K$ is a $K \times K$ matrix $P = (P_{ij})$ such that

• $0 \le P_{ij} \le 1$ for all $i, j$,

• for each $i$, $\sum_{j \in K} P_{ij} = 1$.

A $1 \times K$ vector $\pi$ is a probability vector if its entries are nonnegative and sum to one. If,
in addition, $\pi P = \pi$ then $\pi$ is a steady state vector for $P$.

Definition 24. A transition system for $(G, S)$ is a collection of stochastic matrices
$\{P^s\}_{s \in S}$ and a probability vector $\pi$. It is an invariant transition system if the following
hold.

• For all $s \in S$, $\pi$ is a steady state vector for $P^s$.

• If $G$ is a group rather than a semigroup then for all $s \in S$ and $i, j \in K$, $\pi_i P^{s^{-1}}_{ij} = \pi_j P^{s}_{ji}$.
Just to be careful, note that $P^{s^{-1}}$ is not the inverse of $P^s$. It equals $P^t$ where $t = s^{-1}$.

Even if $G = \mathbb{Z}$, this definition differs from the classical case in a minor detail. Typically,
only one transition matrix is given. But the above definition requires two: $P^s$ and $P^{s^{-1}}$.
Of course, the second condition above implies that $P^{s^{-1}}$ is determined by $P^s$ so the two
definitions are really equivalent. This redundancy will make forthcoming arguments a little
simpler.
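
As an illustration of Definition 24, the following is a minimal sketch of how the two invariance conditions could be checked for a finite state space. The function names, the dictionary encoding of $\{P^s\}$ and the `inverse` map are assumptions made for this example only; exact rational arithmetic is used so that equalities can be tested literally.

```python
from fractions import Fraction

def is_stochastic(P):
    """Every entry is nonnegative and every row sums to one."""
    return all(all(p >= 0 for p in row) and sum(row) == 1 for row in P)

def is_invariant_transition_system(P, pi, inverse):
    """P: dict generator -> square matrix (list of rows), pi: list of weights,
    inverse: dict mapping each generator s to the label of s^{-1}.
    (Hypothetical encoding; the group case of Definition 24 is assumed.)"""
    k = len(pi)
    if any(p < 0 for p in pi) or sum(pi) != 1:
        return False
    for s, Ps in P.items():
        if not is_stochastic(Ps):
            return False
        # Steady state condition: pi P^s = pi.
        for j in range(k):
            if sum(pi[i] * Ps[i][j] for i in range(k)) != pi[j]:
                return False
        # Reversal condition: pi_i P^{s^-1}_{ij} = pi_j P^s_{ji}.
        Pinv = P[inverse[s]]
        for i in range(k):
            for j in range(k):
                if pi[i] * Pinv[i][j] != pi[j] * Ps[j][i]:
                    return False
    return True

# A cyclic shift on three states: P^{a^-1} must be the transpose of P^a.
shift = [[Fraction(int(j == (i + 1) % 3)) for j in range(3)] for i in range(3)]
back = [[shift[j][i] for j in range(3)] for i in range(3)]
uniform = [Fraction(1, 3)] * 3
inv = {"a": "A", "A": "a"}
ok = is_invariant_transition_system({"a": shift, "A": back}, uniform, inv)
bad = is_invariant_transition_system({"a": shift, "A": shift}, uniform, inv)
```

In this example `ok` holds because the reversed matrix is the transpose of the shift, while `bad` fails the reversal condition even though $\pi$ is a steady state vector for both matrices.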

Definition 25. The Markov chain over $G$ induced by the transition system $P := (\{P^s\}_{s \in S}, \pi)$
is the $G$-indexed set of random variables $(X_g)_{g \in G}$ satisfying the following conditions:

• The distribution of $X_e$ equals $\pi$. I.e., for any $k \in K$, the probability that $X_e = k$
equals $\pi_k$. Formally, $\Pr(X_e = k) = \pi_k$.

• Let $g \in G$ and $s \in S$ be such that $|sg| = |g| + 1$ where $|\cdot|$ denotes word length. Let
$f_1, \ldots, f_n \in \mathrm{Past}(sg; g) - \{g\}$. Then for any $k, k_0, \ldots, k_n \in K$,

\[ \Pr(X_{sg} = k \mid X_g = k_0, X_{f_1} = k_1, \ldots, X_{f_n} = k_n) = \Pr(X_{sg} = k \mid X_g = k_0) = P^s_{k_0 k}. \]

It is an invariant Markov chain if $P$ is invariant.
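
The two conditions in Definition 25 suggest a direct way to simulate the chain on a finite ball of a free group: draw $X_e$ from $\pi$, then move outward one generator at a time. The sketch below assumes group elements are encoded as reduced words (tuples of generator labels, so that $sg$ is the tuple `(s,) + g`); this encoding, the function name and its parameters are illustrative choices and are not taken from the paper.

```python
import random

def sample_chain_on_ball(P, pi, inverse, radius, rng):
    """Sample X_g for every reduced word g of length <= radius.
    P: dict generator -> matrix, pi: list, inverse: dict s -> s^{-1}."""
    states = range(len(pi))
    identity = ()
    X = {identity: rng.choices(states, weights=pi)[0]}
    frontier = [identity]
    for _ in range(radius):
        next_frontier = []
        for g in frontier:
            for s in P:
                if g and g[0] == inverse[s]:
                    continue  # s*g would cancel, so |sg| = |g| - 1
                sg = (s,) + g
                # X_{sg} depends on the past only through X_g.
                X[sg] = rng.choices(states, weights=P[s][X[g]])[0]
                next_frontier.append(sg)
        frontier = next_frontier
    return X

# Rank-2 free group with the deterministic "flip" matrix for every generator.
inverse = {"a": "A", "A": "a", "b": "B", "B": "b"}
flip = [[0.0, 1.0], [1.0, 0.0]]
P = {s: flip for s in inverse}
X = sample_chain_on_ball(P, [0.5, 0.5], inverse, 2, random.Random(0))
```

With the flip matrix the sample is determined by $X_e$: the state at $g$ is $X_e$ when $|g|$ is even and the other state when $|g|$ is odd.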

Definition 26. For any measure $\mu$ on $X$ and any Borel sets $A, B \subset X$ with $\mu(B) > 0$ define

\[ \mu(A|B) = \frac{\mu(A \cap B)}{\mu(B)}. \]



Definition 27. Let $(X_g)_{g \in G}$ be defined as above. Define the random function $x : G \to K$
by $x(g) = X_g$. Let $\mu$ be the probability measure on $K^G$ equal to the law of $x$ (i.e., for any
Borel $E \subset K^G$, $\mu(E)$ is the probability that $x$ is contained in $E$).

We say that $(T, K^G, \mu, \alpha)$ is the process induced by the transition system $P := (\{P^s\}_{s \in S}, \pi)$
(where $T$ is the canonical action of $G$ on $K^G$ and $\alpha$ is the canonical partition of $K^G$). Corollary 7.5
below shows that it is Markov.

The conditions on $(X_g)_{g \in G}$ stated above can be restated in terms of the measure $\mu$ as
follows. For each $k \in K$, let $A_k = \{y \in K^G \mid y(e) = k\}$. Then

• For all $k \in K$, $\mu(A_k) = \pi_k$,

• Let $g \in G$ and $s \in S$ be such that $|sg| = |g| + 1$. Let $f_1, \ldots, f_n \in \mathrm{Past}(sg; g) - \{g\}$.
Then for any $k, k_0, \ldots, k_n \in K$,

\[ \mu\Big(T_{sg}^{-1}A_k \,\Big|\, T_g^{-1}A_{k_0} \cap \bigcap_{i=1}^{n} T_{f_i}^{-1}A_{k_i}\Big) = \mu(T_{sg}^{-1}A_k \mid T_g^{-1}A_{k_0}) = P^s_{k_0 k}. \]


In order to prove that $(T, K^G, \mu, \alpha)$ is a Markov process, we first need to prove that $\mu$ is
$T$-invariant (when $P$ is invariant). This is accomplished next. So fix an invariant transition
system $P := (\{P^s\}_{s \in S}, \pi)$. For the next three lemmas, the identity element in $G$ is denoted
by id.

Definition 28. Let $\Gamma_L$ be the left-Cayley graph of $(G, S)$. If $e$ is an edge of
$\Gamma_L$, let $e_-, e_+$ denote the endpoints of $e$ where $e_-$ is the vertex that is closest to the identity
element in $\Gamma_L$. If $F \subset G$ is any set, let $E(F)$ denote the set of edges $e$ in $\Gamma_L$ that are directed
from $e_- \in F$ to $e_+ \in F$.

Lemma 7.2. Let $F \subset G$ be a finite left-connected set with $\mathrm{id} \in F$. Let $z : F \to K$ be an
arbitrary function and let

\[ C = \{x \in K^G \mid x(g) = z(g) \ \forall g \in F\} \]

be the cylinder set induced by $z$ and $F$. For each edge $e \in E(F)$, let $p_z(e) = P^s_{ij}$ where
$z(e_-) = i$ and $z(e_+) = j$ and $s \in S$ is such that $se_- = e_+$. Then

\[ \mu(C) = \pi_{z(\mathrm{id})} \prod_{e \in E(F)} p_z(e). \]

Proof. This is immediate from the definition. □
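
The product formula of Lemma 7.2 can be evaluated mechanically. The sketch below assumes the same hypothetical encoding of group elements as reduced words (tuples read left to right, identity the empty tuple), so that the edge ending at a nonempty word `g` starts at `g[1:]` and is labeled by `g[0]`. A left-connected set containing the identity is then closed under dropping the first letter, which the function checks.

```python
from fractions import Fraction
from itertools import product

def cylinder_probability(P, pi, z):
    """Measure of the cylinder {x : x(g) = z[g] for g in F}, where F = z.keys()
    is assumed left-connected and to contain the identity ()."""
    if () not in z:
        raise ValueError("F must contain the identity")
    prob = pi[z[()]]
    for g, state in z.items():
        if not g:
            continue
        s, parent = g[0], g[1:]
        if parent not in z:
            raise ValueError("F must be left-connected")
        prob *= P[s][z[parent]][state]  # the factor p_z(e) for the edge parent -> g
    return prob

# Two states, every generator using the same symmetric matrix with eps = 1/4.
eps = Fraction(1, 4)
M = [[eps, 1 - eps], [1 - eps, eps]]
P = {"a": M, "b": M}
pi = [Fraction(1, 2), Fraction(1, 2)]
F = [(), ("a",), ("b", "a")]
value = cylinder_probability(P, pi, {(): 0, ("a",): 1, ("b", "a"): 1})
total = sum(cylinder_probability(P, pi, dict(zip(F, zs)))
            for zs in product(range(2), repeat=len(F)))
```

Here `value` is $\pi_0 \cdot P^a_{01} \cdot P^b_{11}$, and summing over all labelings of a fixed $F$ gives total mass one, as it should for a probability measure.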

Our proof of invariance handles the group case separately from the semigroup case. 

Lemma 7.3. Suppose $G$ is a group. For all $g \in G$ and all Borel $E \subset K^G$, $\mu(E) = \mu(T_g^{-1}E)$.
I.e., $\mu$ is $T_g$-invariant.

Proof. Let $F, C, z$ be as in the previous lemma. Assume that $S \subset F$. Let $t \in S$. We will
show that $\mu(C) = \mu(T_t^{-1}C)$. Let $t^{-1}z : Ft \to K$ be the function $(t^{-1}z)(ft) = z(f)$ for all
$f \in F$. Then

\[ T_t^{-1}C = \{x \in K^G \mid x(gt) = z(g) \ \forall g \in F\} = \{x \in K^G \mid x(g) = (t^{-1}z)(g) \ \forall g \in Ft\}. \]

The previous lemma implies

\[ \mu(T_t^{-1}C) = \pi_{(t^{-1}z)(\mathrm{id})} \prod_{e \in E(Ft)} p_{t^{-1}z}(e). \]

If $e$ is an edge of $\Gamma_L$ then let $e \cdot t$ denote the edge with endpoints $e_-t$ and $e_+t$.

Claim: Either $(e_-, e_+) = (\mathrm{id}, t^{-1})$ or $((e \cdot t)_-, (e \cdot t)_+) = (e_-t, e_+t)$.

To prove the claim, let $g \in G$, $s \in S$ be such that $(e_-, e_+) = (g, sg)$. Let $\gamma$ denote the
path in $\Gamma_L$ from id to $sg$. Then $\gamma \cdot t$ is the path in $\Gamma_L$ from $t$ to $sgt$. If $|g| \ge 1$ then this path
has length at least 2. This implies $|t| < |gt| < |sgt|$. I.e., $(e \cdot t)_- = e_-t$ and $(e \cdot t)_+ = e_+t$.
The case $|g| = 0$ (i.e., $g = \mathrm{id}$) is obvious. This proves the claim.

The claim implies that if $e \in E(F)$ is such that $(e_-, e_+) \ne (\mathrm{id}, t^{-1})$ then $p_{t^{-1}z}(e \cdot t) = p_z(e)$.
So if we let $e_*$ be the edge from id to $t^{-1}$ then

\[ \mu(T_t^{-1}C) = \pi_{z(t^{-1})} \, p_{t^{-1}z}(e_* \cdot t) \prod_{e \in E(F) - \{e_*\}} p_z(e). \]

Let $i = z(\mathrm{id})$ and $j = z(t^{-1})$. By definition of $p$ and the definition of an invariant
transition system,

\[ \mu(T_t^{-1}C) = \pi_j P^t_{ji} \prod_{e \in E(F) - \{e_*\}} p_z(e) = \pi_i P^{t^{-1}}_{ij} \prod_{e \in E(F) - \{e_*\}} p_z(e) = \mu(C). \]

Since this is true for all cylinder sets $C$ whose domain contains $S$, it is true for all cylinder
sets (since any cylinder set is a disjoint union of such sets). Since the cylinder sets generate
the Borel $\sigma$-algebra of $K^G$, it follows that $\mu(T_t^{-1}E) = \mu(E)$ for all Borel sets $E \subset K^G$. Since
this is true for all $t \in S$ and $S$ generates $G$, it follows that $\mu$ is $T_g$-invariant for all $g \in G$. □

Lemma 7.4. Suppose $G$ is a semigroup. For all $g \in G$ and all Borel $E \subset K^G$, $\mu(E) =
\mu(T_g^{-1}E)$. I.e., $\mu$ is $T_g$-invariant.

Proof. Let $F, C, z$ be as in lemma 7.2. Let $t \in S$. For each $k \in K$, let $z_k : \{\mathrm{id}\} \cup Ft \to K$
be defined by $z_k(ft) = z(f)$ if $f \in F$ and $z_k(\mathrm{id}) = k$. Let $C_k = \{x \in K^G \mid x(g) = z_k(g) \ \forall g \in
\{\mathrm{id}\} \cup Ft\}$. Since $T_t^{-1}C$ is the disjoint union of $C_k$ over $k \in K$, it follows from lemma 7.2
that

\[ \mu(T_t^{-1}C) = \sum_{k \in K} \mu(C_k) = \sum_{k \in K} \pi_k \prod_{e \in E(\{\mathrm{id}\} \cup Ft)} p_{z_k}(e). \]

Let $e_*$ be the edge from id to $t$. Then $p_{z_k}(e_*) = P^t_{kl}$ where $z(\mathrm{id}) = l$. If $e \in E(F)$ then $e \cdot t \in E(Ft)$
and $((e \cdot t)_-, (e \cdot t)_+) = (e_-t, e_+t)$. Hence $p_{z_k}(e \cdot t) = p_z(e)$. Also, $E(\{\mathrm{id}\} \cup Ft) = E(Ft) \cup \{e_*\}$.
Thus,

\[ \mu(T_t^{-1}C) = \sum_{k \in K} \pi_k P^t_{kl} \prod_{e \in E(F)} p_z(e) = \pi_l \prod_{e \in E(F)} p_z(e) = \mu(C). \]

The second equality follows from the assumption that $\pi$ is a steady state vector for $P^t$.

Since the cylinder sets generate the Borel $\sigma$-algebra of $K^G$, it follows that $\mu(T_t^{-1}E) =
\mu(E)$ for all Borel sets $E \subset K^G$. Since this is true for all $t \in S$ and $S$ generates $G$, it follows
that $\mu$ is $T_g$-invariant for all $g \in G$. □

Corollary 7.5. Any process $(T, K^G, \mu, \alpha)$ induced by an invariant transition system $P$ is
Markov.

Proof. This follows immediately from the previous two lemmas. □

Corollary 7.6. If $(T, K^G, \mu, \alpha)$ is induced by an invariant transition system $P = (\{P^s\}_{s \in S}, \pi)$
then

\[ f(T) = (2r - 1) \sum_{i \in K} \pi_i \log(\pi_i) - \sum_{s \in S^+} \sum_{i,j \in K} \pi_i P^s_{ij} \log(\pi_i P^s_{ij}). \]

Here $S^+ = \{s_1, \ldots, s_r\}$.

Proof. This follows from the previous corollary and theorem 6.1. □
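
For a finite state space the formula of Corollary 7.6 is straightforward to evaluate numerically. The following sketch assumes the matrices for the $r$ generators in $S^+$ are supplied as a list and uses natural logarithms with the convention $0 \log 0 = 0$; the function name and input layout are choices made for this illustration only.

```python
import math

def f_invariant(P_plus, pi):
    """Evaluate (2r-1) * sum_i pi_i log pi_i
       - sum_{s in S+} sum_{i,j} pi_i P^s_ij log(pi_i P^s_ij),
    where P_plus lists one matrix per generator in S+ (so r = len(P_plus))."""
    def xlogx(x):
        return x * math.log(x) if x > 0 else 0.0

    r = len(P_plus)
    k = len(pi)
    vertex_term = sum(xlogx(p) for p in pi)
    edge_term = sum(xlogx(pi[i] * Ps[i][j])
                    for Ps in P_plus for i in range(k) for j in range(k))
    return (2 * r - 1) * vertex_term - edge_term

half = [0.5, 0.5]
independent = [[0.5, 0.5], [0.5, 0.5]]   # rows equal to pi: no dependence
flip = [[0.0, 1.0], [1.0, 0.0]]          # the neighbor is always the other state
f_independent = f_invariant([independent, independent], half)
f_flip = f_invariant([flip, flip], half)
```

For $r = 2$, the independent system gives $\log 2$ (the Shannon entropy of $\pi$), while the deterministic flip gives $(1 - r)\log 2 = -\log 2$.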



Proof of theorem 7.1. Let $\beta = (B_1, B_2, \ldots)$ and $K = \mathbb{N}$. Let $\pi$ be the $1 \times K$ vector defined
by $\pi_k = \nu(B_k)$. Let $P^s_{ij} = \nu(U_s^{-1}B_j \mid B_i)$. It is a simple exercise (using the $U$-invariance of $\nu$)
to check that $P = (\{P^s\}_{s \in S}, \pi)$ is an invariant transition system. Let $X = (T, K^G, \mu, \alpha)$ be
the Markov process induced by $P$. It is automatic that $d_1(X, Y) = 0$. This proves existence.
Uniqueness is trivial. □
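
The construction in this proof only uses the pair probabilities $\nu(B_i \cap U_s^{-1}B_j)$. The sketch below shows one way to turn such a table into $(\{P^s\}, \pi)$ for a finite partition. The input layout `J[s][i][j]`, the function name, and the choice of an identity row when $\pi_i = 0$ (where the conditional measure is undefined) are all assumptions of this example; exact fractions are used so the consistency check on the marginals is literal.

```python
from fractions import Fraction

def transition_system_from_pairs(J):
    """J[s][i][j] plays the role of nu(B_i ∩ U_s^{-1} B_j).
    Returns (P, pi) with pi_i = nu(B_i) and P[s][i][j] = nu(U_s^{-1} B_j | B_i)."""
    first = next(iter(J.values()))
    k = len(first)
    pi = [sum(first[i]) for i in range(k)]
    P = {}
    for s, table in J.items():
        if [sum(row) for row in table] != pi:
            raise ValueError("row marginals must equal pi for every generator")
        P[s] = [[table[i][j] / pi[i] if pi[i] else Fraction(int(i == j))
                 for j in range(k)]
                for i in range(k)]
    return P, pi

J = {"a": [[Fraction(1, 8), Fraction(3, 8)],
           [Fraction(3, 8), Fraction(1, 8)]]}
P, pi = transition_system_from_pairs(J)
```

In this example $\pi = (1/2, 1/2)$ and each row of $P^a$ is obtained by dividing the corresponding row of the table by $\pi_i$.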
8 Examples

In this section, we give three examples of Markov chains over free groups: one related to
the Wired Spanning Forest, one to perfect matchings, and a third one with negative $f$-invariant.
These are not used in the rest of the paper.

8.1 The Wired Spanning Forest 

The uniform spanning tree (UST) on a finite graph is a subgraph chosen uniformly at random
among all spanning trees. In [Pe91], R. Pemantle answered a question of R. Lyons by showing
that if $\mathcal{G}$ is an infinite graph and if $\mathcal{G}_1 \subset \mathcal{G}_2 \subset \ldots$ is an exhaustion of $\mathcal{G}$ by finite connected
subgraphs, then the weak limit of the UST on $\mathcal{G}_n$ exists. The limit is called the free spanning
forest (FSF). In his proof, R. Pemantle introduced another model that is now called the wired
spanning forest (WSF). It is defined as follows. As above, let $\mathcal{G}_1 \subset \mathcal{G}_2 \subset \ldots$ be an exhaustion
of $\mathcal{G}$ by finite connected subgraphs. Let $\mathcal{G}_i^W$ be the graph $\mathcal{G}_i$ with all of its boundary vertices
identified (i.e., wired) to a single vertex. Then the WSF on $\mathcal{G}$ is the weak limit of the UST
on $\mathcal{G}_i^W$ as $i \to \infty$. See [BLPS01] for a thorough study of the construction and properties of
the FSF and WSF as well as references to other works on the subject.

Here we are interested in the WSF on the left-Cayley graph $\Gamma = \Gamma_L$ of the group $G =
\langle s_1, \ldots, s_r \rangle$. We will describe it as a Markov chain over $G$ with state space $S = \{s_1^{\pm 1}, \ldots, s_r^{\pm 1}\}$.
But before this, we give a little intuition as to what we are doing.

Let $x : G \to S$ be a function. Let $F_x$ be the subgraph of $\Gamma$ defined as follows. An edge
from $g$ to $sg$ is in $F_x$ if and only if either $x(g) = s$ or $x(sg) = s^{-1}$. It is automatic that $F_x$ is a
spanning forest because the Cayley graph is a tree. Now, suppose $x$ satisfies the following
condition: if $x(g) = s \in S$ then $x(sg) \ne s^{-1}$. In this case, $F_x$ has no finite components. The
Markov measure $\mu$ on $S^G$ that we will define is maximally symmetric and has the property
that if $x : G \to S$ is a random element drawn according to $\mu$ then $x$ satisfies the above
condition so that $F_x$ has no finite components.
The transition system of the Markov chain is denoted here by $P = (\{P^s\}_{s \in S}, \pi)$ as
usual. In agreement with the above discussion, $P^s_{s,s^{-1}} = 0$ for all $s \in S$. The symmetry
considerations lead to the following values for every $s \in S$.

\[ \pi_s = \frac{1}{|S|}, \]
\[ P^s_{s,t} = \frac{1}{|S| - 1} \quad \text{for all } t \ne s^{-1}, \]
\[ P^s_{t,s^{-1}} = \frac{1}{|S| - 1} \quad \text{for all } t \ne s, \]
\[ P^s_{u,v} = \frac{|S| - 2}{(|S| - 1)^2} \quad \text{for all } u, v \in S \text{ with } u \ne s, v \ne s^{-1}. \]

So, the $f$-value of this system is:

\[ f = \log(|S|(|S| - 1)) + \frac{|S| - 2}{2} \log\Big(\frac{|S|(|S| - 1)^2}{|S| - 2}\Big) - (|S| - 1)\log(|S|) \]
\[ = \Big(1 + \frac{|S| - 2}{2} - |S| + 1\Big)\log(|S|) + (|S| - 1)\log(|S| - 1) - \frac{|S| - 2}{2}\log(|S| - 2) \]
\[ = (1 - r)\log(2r) + (2r - 1)\log(2r - 1) + (1 - r)\log(2r - 2). \]

Using Wilson's algorithm [Wi96], it can be proven that the random graph $F_x$ (where $x$
has law given by the above Markov measure) is the WSF. For a comparison, let $\mathcal{G}_n$ be a
connected graph on $n$ vertices such that the random weak limit of the sequence $\{\mathcal{G}_n\}$ is a $2r$-regular
tree (see [Ly05] for definitions). Improving on an earlier result of [Mc83], in [Ly05] it
is proven that the exponential growth rate of the number of spanning trees in $\mathcal{G}_n$ is exactly
$(1 - r)\log(2r) + (2r - 1)\log(2r - 1) + (1 - r)\log(2r - 2)$.
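
As a sanity check on the closed form for the WSF system, the value can be recomputed directly from the transition probabilities. The sketch below is an illustration under stated assumptions: it indexes the state space by integers with state 0 playing the role of $s$ and state 1 the role of $s^{-1}$, uses the symmetry of the system to compute the pair entropy for a single generator, and then forms $(1 - 2r)H(\alpha) + r \cdot H(\alpha \vee T_s^{-1}\alpha)$. It applies only for $r \ge 2$, where $\log(2r - 2)$ is defined.

```python
import math

def wsf_pair_entropy(n):
    """Entropy of the joint law of (x(g), x(sg)) for the WSF system with |S| = n.
    State 0 stands for s and state 1 for s^{-1} (an indexing chosen here)."""
    entropy = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0:
                p = 0.0 if v == 1 else 1.0 / (n - 1)
            else:
                p = 1.0 / (n - 1) if v == 1 else (n - 2) / (n - 1) ** 2
            joint = p / n  # pi is uniform on S
            if joint > 0:
                entropy -= joint * math.log(joint)
    return entropy

def wsf_f_direct(r):
    n = 2 * r
    return (1 - 2 * r) * math.log(n) + r * wsf_pair_entropy(n)

def wsf_f_closed_form(r):
    return ((1 - r) * math.log(2 * r) + (2 * r - 1) * math.log(2 * r - 1)
            + (1 - r) * math.log(2 * r - 2))

differences = [abs(wsf_f_direct(r) - wsf_f_closed_form(r)) for r in range(2, 8)]
```

The two computations agree up to floating-point error for each rank tried, which is consistent with the transition probabilities listed above.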

8.2 Perfect Matchings 

There is a natural random perfect matching on the left-Cayley graph $\Gamma = \Gamma_L$ of the free
group $G = \langle s_1, \ldots, s_r \rangle$. We will describe it as a Markov chain over $G$ with state space
$S = \{s_1^{\pm 1}, \ldots, s_r^{\pm 1}\}$. But before this, we give a little intuition as to what we are doing.

As in the previous example, let $x : G \to S$ be a function. Let $F_x$ be the subgraph of $\Gamma_L$
defined as follows. An edge from $g$ to $sg$ is in $F_x$ if and only if either $x(g) = s$ or $x(sg) = s^{-1}$.
It is automatic that $F_x$ is a spanning forest because $\Gamma_L$ is a tree. Now, suppose $x$ satisfies
the following condition: if $x(g) = s \in S$ then $x(sg) = s^{-1}$. In this case, every component of
$F_x$ consists of a single edge. So $F_x$ is a perfect matching. The Markov measure $\mu$ on $S^G$ that
we will define is maximally symmetric and has the property that if $x : G \to S$ is a random
element drawn according to $\mu$ then $x$ satisfies the above condition so that $F_x$ is a perfect
matching.

The transition system of the Markov chain is denoted here by $P = (\{P^s\}_{s \in S}, \pi)$ as usual.
In agreement with the above discussion, $P^s_{s,s^{-1}} = 1$ for all $s \in S$. Thus, $P^s_{s,t} = 0$ for all $t \ne s^{-1}$.
Imposition of maximal symmetry conditions leads to the following values for every $s \in S$.

\[ \pi_s = \frac{1}{|S|}, \]
\[ P^s_{t,s^{-1}} = 0 \quad \text{for all } t \ne s, \]
\[ P^s_{u,v} = \frac{1}{|S| - 1} \quad \text{for all } u, v \in S \text{ with } u \ne s, v \ne s^{-1}. \]

So, the $f$-value of this system is:

\[ -\frac{1}{2}\Big(\sum_{s \in S} \sum_{i,j \in S} \pi_i P^s_{ij} \log(\pi_i P^s_{ij})\Big) + (2r - 1)\sum_{i \in S} \pi_i \log(\pi_i) \]
\[ = \frac{1}{2}\log(|S|) + \Big(\frac{|S| - 1}{2}\Big)\log(|S|(|S| - 1)) - (2r - 1)\log(|S|). \]

For a comparison, let $\mathcal{G}_{n,2r}$ be a graph chosen uniformly at random among all $2r$-regular
graphs on $n$ vertices. In [BM86], it is proven that $E[M_n]$, the expected number of perfect
matchings on $\mathcal{G}_{n,2r}$, is asymptotic (as $n \to \infty$) to

\[ \sqrt{2}\, e^{1/4} \exp\Big(-\Big(\frac{2r - 2}{2}\Big)\log(2r)\,n + \Big(\frac{2r - 1}{2}\Big)\log(2r - 1)\,n\Big). \]

8.3 A mixing Markov chain with negative $f$-invariant

Proposition 8.1. If $G$ is a nonabelian free group then there exists a Markov process $(T, K^G, \mu, \alpha)$
such that $-\infty < f(T) < 0$.

Proof. Let $0 \le \epsilon \le 1$ be given. Let $K$ be a two-element set. Let $\pi = [\tfrac{1}{2} \ \tfrac{1}{2}]$. For each $s \in S$,
let

\[ P^s = \begin{bmatrix} \epsilon & 1 - \epsilon \\ 1 - \epsilon & \epsilon \end{bmatrix}. \]

It is easy to check that $P = (\{P^s\}_{s \in S}, \pi)$ is an invariant transition system for all $\epsilon \in [0, 1]$.
Let $(T, K^G, \mu_\epsilon, \alpha)$ be the induced Markov process. Its $f$-value, denoted $f(T, \mu_\epsilon)$, varies
continuously with $\epsilon$. By corollary 7.6, $f(T, \mu_0) = (1 - 2r)\log(2) + r\log(2) = -(r - 1)\log(2)$, which is negative since $r \ge 2$. Hence $(T, K^G, \mu_\epsilon, \alpha)$ is a Markov
process with negative $f$-invariant for all $\epsilon > 0$ sufficiently small. □
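
To make "sufficiently small" concrete, one can evaluate the formula of corollary 7.6 on this two-state family. A short computation (not carried out in the text, so it should be read as an assumption of this sketch) gives $f(T, \mu_\epsilon) = (1 - r)\log 2 + r\,h(\epsilon)$, where $h(\epsilon) = -\epsilon\log\epsilon - (1 - \epsilon)\log(1 - \epsilon)$ is the binary entropy. The code below locates by bisection the value of $\epsilon \in (0, 1/2)$ at which this expression changes sign; the function names are illustrative.

```python
import math

def binary_entropy(eps):
    """h(eps) = -eps log eps - (1-eps) log(1-eps), with 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in (eps, 1.0 - eps) if p > 0)

def f_two_state(r, eps):
    """Assumed closed form of the f-value for the symmetric two-state family."""
    return (1 - r) * math.log(2) + r * binary_entropy(eps)

def negativity_threshold(r, tol=1e-12):
    """Largest eps in (0, 1/2) below which f_two_state(r, eps) < 0 (for r >= 2)."""
    lo, hi = 0.0, 0.5  # f < 0 at eps = 0 and f = log 2 > 0 at eps = 1/2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f_two_state(r, mid) < 0:
            lo = mid
        else:
            hi = mid
    return lo

threshold_rank_2 = negativity_threshold(2)
```

For rank $r = 2$ the threshold is roughly $0.11$, so under this closed form every $\epsilon$ below that value yields a negative $f$-value.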

In [Bo09] it is shown that no Bernoulli shift factors onto a shift with negative /-invariant. 
Hence each system constructed above is not even weakly isomorphic to a Bernoulli shift. It 
is interesting to compare this with the well-known result [FO70] that every mixing Markov 
chain over the integers is isomorphic to a Bernoulli shift. By comparison, it can be proven 
that for $\epsilon \in (0,1)$, the systems constructed above are uniformly mixing. This leads to an
interesting open problem: classify mixing Markov systems over a free group up to measure- 
conjugacy. 

9 Markov approximations and the proof that f = f* 

The purpose of this section is to prove: 

Theorem 9.1. Let $(T, X, \mu, \alpha)$ be a $G$-process with $H(\alpha) < \infty$. Let $\beta$ be a partition of $X$
with $H(\beta) < \infty$ and $\beta^G \subset \alpha^G$. Then $f_*(\alpha|\beta^G) = f(\alpha|\beta^G)$.

I do not know if the result holds if $H(\beta) = +\infty$. The proof is an application of theorem
6.1. We will approximate the given process by a sequence of Markov processes. The first
step is to embed the given process into a symbolic process as defined next.

Definition 29. A process of the form $(T, K^G, \mu, \alpha)$ where $T$ is the canonical action on $K^G$,
$K$ is finite or countably infinite and $\alpha$ is the canonical partition is a symbolic process.

Lemma 9.2. Let $(S, Y, \nu, \beta)$ be a $G$-process. If $\beta$ is generating then there is a canonical
process isomorphism $\phi : (S, Y, \nu, \beta) \to (T, \beta^G, \mu, \alpha)$ where $(T, \beta^G, \mu, \alpha)$ is symbolic.

Proof. For $y \in Y$ define $\phi(y) : G \to \beta$ by $\phi(y)(g) = B$ iff $S_g y \in B \in \beta$. Let $T$ be the
canonical action of $G$ on $\beta^G$.

If $f \in G$ then $\phi(S_g y)(f) = B$ iff $S_f S_g y \in B$ iff $S_{fg} y \in B$ iff $\phi(y)(fg) = B$ iff $(T_g \phi(y))(f) =
B$. So $\phi$ is $G$-equivariant. If $B \in \beta$ then $\phi(B) = \{x \in \beta^G \mid x(e) = B\}$. Thus $\phi$ maps $\beta$ to
the canonical partition of $\beta^G$. Let $\mu = \phi_*(\nu)$. $\phi$ is invertible because $\beta$ is generating. □

Lemma 9.3. Let $(T, K^G, \mu, \alpha)$ be a symbolic process. Let $\beta$ be a partition with $\alpha \le \beta \le \alpha^n$
for some $n \ge 0$. Then there exists a unique measure $\mu_\beta$ such that $(T, K^G, \mu_\beta, \beta)$ is Markov
and

\[ d_1((T, K^G, \mu, \beta), (T, K^G, \mu_\beta, \beta)) = 0. \]

Proof. By the previous lemma applied to $(T, K^G, \mu, \beta)$, there is a canonical $G$-equivariant
embedding $\phi : K^G \to \beta^G$. Let $\{U_g\}_{g \in G}$ denote the canonical action of $G$ on $\beta^G$ and let $\gamma$
denote the canonical partition of $\beta^G$. Consider the process $(U, \beta^G, \phi_*\mu, \gamma)$. It is isomorphic
to the process $(T, K^G, \mu, \beta)$.

By theorem 7.1 there exists a unique measure $\nu$ on $\beta^G$ such that $(U, \beta^G, \nu, \gamma)$ is Markov
and

\[ d_1((U, \beta^G, \phi_*\mu, \gamma), (U, \beta^G, \nu, \gamma)) = 0. \]

Let $\mu_\beta$ be the pullback $\phi^*(\nu)$. It follows from the fact that $\alpha \le \beta \le \alpha^n$ that the support
of $\nu$ is contained in the image of $\phi$. So $\mu_\beta$ is a well-defined $G$-invariant probability measure.
In fact, $(T, K^G, \mu_\beta, \beta)$ is process-isomorphic (via $\phi$) to $(U, \beta^G, \nu, \gamma)$. So $(T, K^G, \mu_\beta, \beta)$ is a
Markov process. It is easy to check that

\[ d_1((T, K^G, \mu, \beta), (T, K^G, \mu_\beta, \beta)) = 0. \]

□

Definition 30. If $(T, K^G, \mu, \alpha)$, $\beta$ and $\mu_\beta$ are as in the previous lemma then $\mu_\beta$ is called the
Markov approximation to $\mu$ induced by $\beta$.

Lemma 9.4. Let $(T, K^G, \mu, \alpha)$ be a symbolic process. Let $\{\beta_n\}_{n=1}^{\infty}$ be a sequence of partitions
such that for all $n$ there exist integers $I(n) \le J(n)$ with $\alpha^{I(n)} \le \beta_n \le \alpha^{J(n)}$ and
$\lim_{n \to \infty} I(n) = \infty$. Then $\mu_{\beta_n}$ converges to $\mu$ in the weak* topology.

Proof. Since

\[ d_1((T, K^G, \mu_{\beta_n}, \beta_n), (T, K^G, \mu, \beta_n)) = 0, \]

$\mu_{\beta_n}(B) = \mu(B)$ for all $B \in \beta_n$. Hence $\mu_{\beta_n}(B) = \mu(B)$ for all $B \in \alpha^{I(n)}$. Since $\lim_{n \to \infty} I(n) = +\infty$, this
implies the lemma. □

Before proving theorem 9.1 we need to prove that $f$ and $f_*$ are upper semi-continuous
in the variable $\mu$. As in §7.2 let $M(K^G)$ denote the space of all invariant Borel probability
measures on $K^G$ where $K$ is finite or countable. If $\mu \in M(K^G)$ and $\beta$ is a partition of $K^G$,
let $f(\mu, \beta)$ be the $f$-invariant of the process $(T, K^G, \mu, \beta)$ where $T$ is the canonical action.

Lemma 9.5. Let $\alpha$ be the canonical partition of $K^G$. Let $\mathcal{F}$ be a $T(G)$-invariant Borel
$\sigma$-algebra. Then the map $\mu \mapsto f_*(\mu, \alpha|\mathcal{F})$ is upper semi-continuous with respect to the weak*
topology. Similarly, the function $\mu \mapsto f(\mu, \alpha|\mathcal{F})$ is upper semi-continuous with respect to the
weak* topology.

Proof. It is well-known that for every $s \in S$, the function $\mu \mapsto h(T_s, \mu, \alpha|\mathcal{F})$ is upper
semi-continuous in the variable $\mu$ (e.g., [Gl03, lemma 15.1, page 270]). For example,
this follows from the fact that, for every $n$, the function $\mu \mapsto \frac{1}{n+1} H(\mu, \bigvee_{k=0}^{n} T_s^{-k}\alpha \mid \mathcal{F})$ is
continuous (since conditional expectation with respect to $\mathcal{F}$ is continuous) and $h(T_s, \mu, \alpha|\mathcal{F})$
is the infimum of these functions. Thus, for every $n$, the function $\mu \mapsto F_*(\mu, \alpha^n|\mathcal{F})$ is upper
semi-continuous. Since $f_*(\mu, \alpha|\mathcal{F}) = \inf_n F_*(\mu, \alpha^n|\mathcal{F})$, the lemma follows. The proof for $f$
in place of $f_*$ is similar. □

Proof of theorem 9.1. After replacing $\alpha$ with $\alpha \vee \beta$ if necessary, we may assume that $\alpha$ refines
$\beta$. We may also assume that $\alpha$ is generating. So after applying the canonical embedding
(lemma 9.2), we may assume that $X = K^G$ and $\alpha$ is the canonical partition of $K^G$.

For each $n$, let $\mu_n = \mu_{\alpha^n}$ be the Markov approximation to $\mu$ induced by $\alpha^n$. We claim
that

\[ f(\mu, \alpha|\beta^G) = \lim_n F(\mu, \alpha^n|\beta^G) = \lim_n F(\mu_n, \alpha^n|\beta^G) = \lim_n F_*(\mu_n, \alpha^n|\beta^G) \]
\[ = \lim_n f_*(\mu_n, \alpha|\beta^G) \le f_*(\mu, \alpha|\beta^G). \]

The first equality holds by definition of $f$, the second holds since $d_1((T, K^G, \mu_n, \alpha^n), (T, K^G, \mu, \alpha^n)) =
0$. The third and fourth equalities follow from theorem 6.1. The previous lemma and lemma
9.4 imply the last inequality.

For the reverse note that for any $s \in S$ and any $n \ge 0$,

\[ h(T_s, \alpha^n|\beta^G) = \lim_{m \to \infty} H\Big(T_s^{-m-1}\alpha^n \,\Big|\, \bigvee_{i=0}^{m} T_s^{-i}\alpha^n \vee \beta^G\Big) \]
\[ \le H(T_s^{-1}\alpha^n \mid \alpha^n \vee \beta^G) \]
\[ = H(\alpha^n \vee T_s^{-1}\alpha^n|\beta^G) - H(\alpha^n|\beta^G). \]

Thus $F(\alpha^n|\beta^G) \ge F_*(\alpha^n|\beta^G)$. Take the limit as $n \to \infty$ to obtain $f(\mu, \alpha|\beta^G) \ge f_*(\mu, \alpha|\beta^G)$.

□



10 The Abramov-Rohlin Formula 

We can now prove theorem 1.3.

Proof of theorem 1.3. By theorem 9.1 it suffices to prove that $f_*(\alpha|\beta^G) = f_*(\alpha \vee \beta) - f_*(\beta)$.
The classical Abramov-Rohlin formula implies that if $n, m \ge 0$ and $s \in S$, then, conditioning on
the smallest $T_s$-invariant $\sigma$-algebra containing $\beta^m$ (abbreviated to $\beta^m$ in the conditional entropy below),

\[ h(T_s, \alpha^n|\beta^m) = h(T_s, \alpha^n \vee \beta^m) - h(T_s, \beta^m). \]

The definition of $F_*$ now implies $F_*(\alpha^n|\beta^m) = F_*(\alpha^n \vee \beta^m) - F_*(\beta^m)$. Thus,

\[ f_*(\alpha|\beta^G) = \lim_{n \to \infty} \lim_{m \to \infty} F_*(\alpha^n|\beta^m) = \lim_{n \to \infty} \lim_{m \to \infty} F_*(\alpha^n \vee \beta^m) - F_*(\beta^m) = f_*(\alpha \vee \beta) - f_*(\beta). \]

The last equality follows from the fact that $F_*$ is monotone decreasing under splittings
(proposition 5.1) and proposition 4.3. □

11 A characterization of Markov processes 

The purpose of this section is to prove: 

Theorem 11.1. A $G$-process $(T, X, \mu, \alpha)$ is Markov if and only if $F(\alpha) = f(\alpha)$.

This theorem is not used in the rest of the paper.

Corollary 11.2 (Markov processes maximize the $f$-invariant). Let $K$ be finite or countable
and let $\mu$ be an invariant Borel probability measure on $K^G$ (with respect to the canonical
action). If $\alpha$ is the canonical partition of $K^G$ then $f(T, \mu) \le F(T, \mu)$ with equality if and
only if $(T, K^G, \mu, \alpha)$ is Markov.

Proof. This follows from the theorem above and the fact that $f(T, \mu) \le F(T, \mu)$ always holds
by the definition of $f$. □

Definition 31. Let $(T, X, \mu, \alpha)$ be a $G$-process. If $Q \subset G$ is finite then let

\[ \alpha^Q := \bigvee_{q \in Q} T_q^{-1}\alpha. \]

Proof of theorem 11.1. By theorem 6.1 it suffices to prove that if $f(\mu, \alpha) = F(\mu, \alpha)$ then
$(T, X, \mu, \alpha)$ is Markov. By lemma 9.2 we may assume without loss of generality that
$(T, X, \mu, \alpha) = (T, K^G, \mu, \alpha)$ is a symbolic process. By theorem 7.1 there exists a Borel probability
measure $\omega$ on $K^G$ such that $(T, K^G, \omega, \alpha)$ is Markov and $d_1((T, K^G, \mu, \alpha), (T, K^G, \omega, \alpha)) =
0$.

Claim 1: Let $Q \subset G$ be finite, right-connected and $e \in Q$. If for some $t \in S$, $A \in \alpha^{Q \cup Qt}$
then $\mu(A) = \omega(A)$.

Note that the claim implies the theorem, because it implies that $\mu(A) = \omega(A)$ for all
$A \in \alpha^n$ for any $n \ge 0$ and thus $\mu = \omega$.

The claim is proven by induction on $|Q|$. If $|Q| = 1$ then it follows from $d_1((T, K^G, \mu, \alpha), (T, K^G, \omega, \alpha)) =
0$. So suppose $|Q| > 1$. Then there exists $u \in S$ and a set $P \subset Q$ such that $P$ is right-connected,
$e \in P$, $|P| < |Q|$ and $Q \subset P \cup Pu$. The induction hypothesis implies that
$\mu(A) = \omega(A)$ for all $A \in \alpha^{P \cup Ps}$ for any $s \in S$.

Note that $Q \cup Qt \subset P \cup Pu \cup Pt \cup Put$. Hence it suffices to show that $\mu(A) = \omega(A)$ for
all $A \in \alpha^{P \cup Pu \cup Pt \cup Put}$.

It suffices to show that for any $A, B, C, D \in \alpha^P$,

\[ \mu(A \cap T_u^{-1}B \cap T_t^{-1}C \cap T_{ut}^{-1}D) = \omega(A \cap T_u^{-1}B \cap T_t^{-1}C \cap T_{ut}^{-1}D). \]

If $u = t$ and $B \ne C$ then both sides equal zero. If $u = t^{-1}$ and $A \ne D$ then both sides equal
zero. So we may assume that these cases do not occur.
Note that 

[m(a n t~ x b n T t _1 c n t^-d) (28) 

= fi(A n T-iQ^DlA n T^Q^BIA n T^C n T^D) (29) 

= n t^c)/^- 1 ^ n ^T^CC 1 ^ n t^c n r^). (30) 

The last line follows from the induction hypothesis. We will show that \x can be replaced 
with u in the last line above. The next two claims help to reduce the problem. 
Claim 2: If 

li{T£D\A n T-'C) = n{T- t l D\T- l C) 

then 

h{t£d\a n rr 1 ^) = ^(t^a n r^C). 

Claim 3: If 

pl(t- 1 b\a n t^c n t~ 1 d) = ^b\a) 

then 

MT-^IA n T- l c n T~ t l D) = uiT-'BlA n t^c n t^d). 

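Proof of claim 2. By two earlier lemmas, $(T, K^G, \omega, \alpha^P)$ is Markov. Hence

$$\omega(T_{ut}^{-1}D \mid A \cap T_t^{-1}C) = \omega(T_{ut}^{-1}D \mid T_t^{-1}C) = \frac{\omega(T_{ut}^{-1}D \cap T_t^{-1}C)}{\omega(T_t^{-1}C)}.$$

By the induction hypothesis, $\omega(T_t^{-1}C) = \mu(T_t^{-1}C)$. By $G$-invariance and the induction
hypothesis,

$$\omega(T_{ut}^{-1}D \cap T_t^{-1}C) = \omega(T_u^{-1}D \cap C) = \mu(T_u^{-1}D \cap C) = \mu(T_{ut}^{-1}D \cap T_t^{-1}C).$$

Hence the above implies

$$\begin{aligned}
\omega(T_{ut}^{-1}D \mid A \cap T_t^{-1}C) &= \frac{\mu(T_{ut}^{-1}D \cap T_t^{-1}C)}{\mu(T_t^{-1}C)} = \mu(T_{ut}^{-1}D \mid T_t^{-1}C) && (31)\\
&= \mu(T_{ut}^{-1}D \mid A \cap T_t^{-1}C). && (32)
\end{aligned}$$

The last equality follows from the hypothesis of claim 2. This proves claim 2.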
Proof of claim 3. Since $(T, K^G, \omega, \alpha^P)$ is Markov,

$$\begin{aligned}
\omega(T_u^{-1}B \mid A \cap T_t^{-1}C \cap T_{ut}^{-1}D) &= \omega(T_u^{-1}B \mid A) = \frac{\omega(T_u^{-1}B \cap A)}{\omega(A)}\\
&= \frac{\mu(T_u^{-1}B \cap A)}{\mu(A)}\\
&= \mu(T_u^{-1}B \mid A) = \mu(T_u^{-1}B \mid A \cap T_t^{-1}C \cap T_{ut}^{-1}D).
\end{aligned}$$

The third equality uses the induction hypothesis. The last equality uses the hypothesis of
claim 3. This proves claim 3.

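Note that if $\mu(T_{ut}^{-1}D \mid A \cap T_t^{-1}C) = \omega(T_{ut}^{-1}D \mid A \cap T_t^{-1}C)$ and
$\mu(T_u^{-1}B \mid A \cap T_t^{-1}C \cap T_{ut}^{-1}D) = \omega(T_u^{-1}B \mid A \cap T_t^{-1}C \cap T_{ut}^{-1}D)$ then equation (28) implies

$$\mu(A \cap T_u^{-1}B \cap T_t^{-1}C \cap T_{ut}^{-1}D) = \omega(A \cap T_u^{-1}B \cap T_t^{-1}C \cap T_{ut}^{-1}D)$$

which implies the theorem.

If $u = t^{-1}$ then, by assumption, $A = D$. Hence $\mu(T_{ut}^{-1}D \mid A \cap T_t^{-1}C) = \omega(T_{ut}^{-1}D \mid A \cap T_t^{-1}C)$.
If $u = t$ then, by assumption, $B = C$. Hence
$\mu(T_u^{-1}B \mid A \cap T_t^{-1}C \cap T_{ut}^{-1}D) = \omega(T_u^{-1}B \mid A \cap T_t^{-1}C \cap T_{ut}^{-1}D)$.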

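So by claims 2 and 3 it suffices to prove that if $u \ne t^{-1}$ then

$$\mu(T_{ut}^{-1}D \mid A \cap T_t^{-1}C) = \mu(T_{ut}^{-1}D \mid T_t^{-1}C)$$

and if $u \ne t$ then

$$\mu(T_u^{-1}B \mid A \cap T_t^{-1}C \cap T_{ut}^{-1}D) = \mu(T_u^{-1}B \mid A).$$

By lemma 3.1 it suffices to prove the following claim.
Claim 4: If $u \ne t^{-1}$ then

$$H(\alpha^{Put} \mid \alpha^{P \cup Pt}) = H(\alpha^{Put} \mid \alpha^{Pt}) = H(\alpha^{Pu} \mid \alpha^{P}) \quad \text{(by $G$-invariance)}$$

and if $u \ne t$ then

$$H(\alpha^{Pu} \mid \alpha^{P \cup Pt \cup Put}) = H(\alpha^{Pu} \mid \alpha^{P}).$$

These entropies and all the ones below are with respect to $\mu$.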

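Both $P$ and $P \cup Pu$ are finite, right-connected and contain the identity element. Hence
lemma 4.2 implies $\alpha^P$ and $\alpha^{P \cup Pu}$ are splittings of $\alpha$. Proposition 5.1 implies

$$F(\alpha) = f(\alpha) \le F(\alpha^{P \cup Pu}) \le F(\alpha^{P}) \le F(\alpha).$$

So equality holds throughout. The above $F$ and $f$ values (and the ones below) are all with
respect to $\mu$. Now,

$$\begin{aligned}
0 &= F(\alpha^{P \cup Pu}) - F(\alpha^{P})\\
&= (1 - 2r)H(\alpha^{Pu} \mid \alpha^{P}) + \sum_{i=1}^{r} H(\alpha^{Pu \cup Pus_i} \mid \alpha^{P \cup Ps_i})\\
&= (1 - 2r)H(\alpha^{Pu} \mid \alpha^{P}) + \sum_{i=1}^{r} \left[ H(\alpha^{Pus_i} \mid \alpha^{P \cup Ps_i}) + H(\alpha^{Pu} \mid \alpha^{P \cup Ps_i \cup Pus_i}) \right].
\end{aligned}$$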
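If, for some $i$, $u = s_i^{-1}$ then

$$H(\alpha^{Pus_i} \mid \alpha^{P \cup Ps_i}) = 0.$$

If, for some $i$, $u = s_i$ then

$$H(\alpha^{Pu} \mid \alpha^{P \cup Ps_i \cup Pus_i}) = 0.$$

Hence one of the $2r$ terms in the above sum equals zero. Since for every $i$

$$H(\alpha^{Pus_i} \mid \alpha^{P \cup Ps_i}) \le H(\alpha^{Pu} \mid \alpha^{P})$$

and

$$H(\alpha^{Pu} \mid \alpha^{P \cup Ps_i \cup Pus_i}) \le H(\alpha^{Pu} \mid \alpha^{P}),$$

the remaining $2r - 1$ terms must each equal $H(\alpha^{Pu} \mid \alpha^{P})$ for the sum to cancel $(1 - 2r)H(\alpha^{Pu} \mid \alpha^{P})$. That is, when $u \ne s_i^{-1}$,

$$H(\alpha^{Pus_i} \mid \alpha^{P \cup Ps_i}) = H(\alpha^{Pu} \mid \alpha^{P})$$

and when $u \ne s_i$,

$$H(\alpha^{Pu} \mid \alpha^{P \cup Ps_i \cup Pus_i}) = H(\alpha^{Pu} \mid \alpha^{P}).$$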

If, for some $i$, $s_i = t$ then the two equations above imply claim 4. Suppose instead that
$s_j^{-1} = t$ for some $j$. By $G$-invariance,

$$F(\alpha) = (1 - 2r)H(\alpha) + \sum_{i=1}^{r} H(\alpha \vee T_{s_i}^{-1}\alpha) = (1 - 2r)H(\alpha) + \sum_{i=1}^{r} H(T_{s_i}\alpha \vee \alpha).$$

Hence we may replace each $s_i$ in the proof of claim 4 with $s_i^{-1}$. This proves claim 4. As
noted above, claim 4 implies claim 1 which implies the theorem. □



12 Limits of Partitions 

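Definition 32. Let $G \curvearrowright^T (X, \mathcal{B}, \mu)$, $\mathcal{F} \subset \mathcal{B}$ be a sub-$\sigma$-algebra and $\{\beta_i\}_{i=1}^{\infty}$ be partitions of
$X$. We will write $\lim_{i\to\infty} \beta_i = \mathcal{F}$ if for every partition $\alpha \subset \mathcal{F}$ with $H(\alpha) < \infty$,

$$\lim_{i\to\infty} H(\alpha \mid \beta_i) = 0$$

and there exists a sequence of partitions $\{\gamma_i\}_{i=1}^{\infty}$ with $\gamma_i \subset \mathcal{F}$ and $\lim_{i\to\infty} d(\gamma_i, \beta_i) = 0$. Here
$d(\cdot,\cdot)$ is the Rohlin distance (definition 9).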

The purpose of this section is to prove the proposition below which will be used in the
proof of the addition formula (theorem 1.5).

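Proposition 12.1. Let $(T, X, \mu, \alpha)$ be a $G$-process with $H(\alpha) < \infty$. Let $\{\beta_i\}_{i=1}^{\infty}$ be partitions
of $X$ with $H(\beta_i) < \infty$ such that $\lim_{i\to\infty} \beta_i = \alpha^G$. Then

$$f(\alpha) = f_*(\alpha) = \lim_{i\to\infty} F(\beta_i) = \lim_{i\to\infty} F_*(\beta_i).$$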

Here is an application. 

Corollary 12.2. There does not exist a finite-entropy generating partition of the canonical
action of $G$ on $([0, 1]^G, \lambda^G)$ where $\lambda$ is Lebesgue measure on $[0, 1]$.

Remark 2. This result was proven first in [Bo08b] (by a different method). It is an open
question whether it holds for all countable groups $G$.

Proof. Let $\sigma_1 \le \sigma_2 \le \ldots$ be an increasing sequence of finite partitions of $[0, 1]$ such that
$\bigvee_{i \ge 1} \sigma_i$ is the $\sigma$-algebra of all measurable sets (up to sets of measure zero). Let $\pi : [0, 1]^G \to [0, 1]$
be the evaluation map $\pi(x) := x(e)$. Let $\beta_i$ be the pullback partition $\beta_i := \pi^*(\sigma_i)$.
Let $\alpha_i =$ [...]. It is an easy exercise to show that $\alpha_i$ converges to the full $\sigma$-algebra of all
measurable sets of $[0, 1]^G$. So, assuming that the system $G \curvearrowright^T ([0, 1]^G, \lambda^G)$ has a finite-entropy
generating partition, it follows from proposition 12.1 that $f(T) = \lim_{i\to\infty} F(\alpha_i)$. We will
show that the latter limit equals $+\infty$, which contradicts the fact that the $f$-invariant is the
infimum of a set of real numbers.

Since $(T, [0, 1]^G, \lambda^G, \beta_i)$ is a Bernoulli process, it follows from a simple calculation
(performed in [Bo08a]) that $f(\beta_i) = F(\beta_i) = H(\beta_i)$. Since $\alpha_i$ is a splitting of $\beta_i$, this implies
$F(\alpha_i) = H(\beta_i)$. By definition, $H(\beta_i) = H(\sigma_i)$. So we have $F(\alpha_i) = H(\sigma_i)$. Obviously,
$\lim_{i\to\infty} H(\sigma_i) = +\infty$. □
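
The "simple calculation" for a Bernoulli process can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the formula $F(\beta) = (1 - 2r)H(\beta) + \sum_{i=1}^{r} H(\beta \vee T_{s_i}^{-1}\beta)$ used in this paper, assumes $\sigma_i$ is the dyadic partition of $[0, 1]$ into $2^i$ intervals (one possible choice, not necessarily the one intended), and uses the hypothetical helper names `entropy` and `bernoulli_F`. Independence of the coordinates at $e$ and $s_i$ makes each join a product distribution, so $F$ collapses to $H$, which grows like $i \log 2$.

```python
import math
from itertools import product

def entropy(probs):
    """Shannon entropy (natural logarithm) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def bernoulli_F(probs, r):
    """F = (1 - 2r) H(beta) + sum_{i=1}^r H(beta v T_{s_i}^{-1} beta) for an i.i.d. process.

    In a Bernoulli process the coordinates at e and at s_i are independent,
    so every join has the product distribution p_a * p_b (an assumption of
    this sketch, valid only for Bernoulli processes).
    """
    h = entropy(probs)
    joint = [pa * pb for pa, pb in product(probs, repeat=2)]
    return (1 - 2 * r) * h + r * entropy(joint)

r = 2  # rank of the free group; illustrative choice
for i in range(1, 6):
    sigma_i = [2.0 ** -i] * (2 ** i)  # dyadic partition of [0, 1] into 2^i intervals
    print(i, entropy(sigma_i), bernoulli_F(sigma_i, r))
```

Both printed columns agree up to rounding and increase without bound, which is the divergence the proof relies on.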

We will need two simple lemmas. 
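Lemma 12.3. If $\alpha, \beta$ are any partitions of $X$ with $H(\alpha) + H(\beta) < \infty$ then

$$|F(\alpha) - F(\beta)| \le (4r - 1)\,d(\alpha, \beta).$$

Proof. This follows immediately from the fact that

$$|H(\alpha) - H(\beta)| \le |H(\alpha) - H(\alpha \vee \beta)| + |H(\alpha \vee \beta) - H(\beta)| = d(\alpha, \beta)$$

and for any $s \in S$,

$$|H(\alpha \vee T_s^{-1}\alpha) - H(\beta \vee T_s^{-1}\beta)| \le d(\alpha \vee T_s^{-1}\alpha, \beta \vee T_s^{-1}\beta) \le 2d(\alpha, \beta).$$

□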
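
The bound in lemma 12.3 can be sanity-checked on a toy system. The following sketch is an illustration only, not part of the paper's argument: it assumes a finite set with uniform measure on which the generators act by arbitrary permutations (any $r$ permutations define a measure-preserving action of the free group of rank $r$), represents a partition by a list of labels, and uses the formula $F(\alpha) = (1 - 2r)H(\alpha) + \sum_{i=1}^{r} H(\alpha \vee T_{s_i}^{-1}\alpha)$. All function names are hypothetical.

```python
import math
import random
from collections import Counter

def H(labels):
    """Entropy of the partition given by the level sets of `labels` (uniform measure)."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())

def join(a, b):
    """Common refinement a v b, encoded by pairs of labels."""
    return list(zip(a, b))

def rohlin(a, b):
    """Rohlin distance d(a, b) = H(a|b) + H(b|a) = 2 H(a v b) - H(a) - H(b)."""
    return 2 * H(join(a, b)) - H(a) - H(b)

def pullback(a, perm):
    """Labels of T^{-1} a for T x = perm[x]: x is in T^{-1}A iff perm[x] is in A."""
    return [a[perm[x]] for x in range(len(a))]

def F(a, perms):
    """F(a) = (1 - 2r) H(a) + sum over generators of H(a v T_s^{-1} a)."""
    r = len(perms)
    return (1 - 2 * r) * H(a) + sum(H(join(a, pullback(a, p))) for p in perms)

def lemma_12_3_holds(n=60, r=2, trials=500, seed=0):
    """Test |F(a) - F(b)| <= (4r - 1) d(a, b) on random nearby partitions."""
    rng = random.Random(seed)
    perms = [rng.sample(range(n), n) for _ in range(r)]
    for _ in range(trials):
        a = [rng.randrange(3) for _ in range(n)]
        b = list(a)
        for x in rng.sample(range(n), 5):  # perturb a few points so d(a, b) is small
            b[x] = rng.randrange(3)
        if abs(F(a, perms) - F(b, perms)) > (4 * r - 1) * rohlin(a, b) + 1e-9:
            return False
    return True

print(lemma_12_3_holds())
```

A passing run is of course no proof; it only confirms that the constant $4r - 1 = (2r - 1) + 2r$ is consistent with the two estimates in the proof.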

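Lemma 12.4. Let $\alpha$, $\{\beta_i\}_{i=1}^{\infty}$ be as in proposition 12.1. If $\{\gamma_i\}_{i=1}^{\infty}$ is a sequence of partitions
with $\lim_{i\to\infty} d(\gamma_i, \beta_i) = 0$ then $\lim_{i\to\infty} \gamma_i = \alpha^G$.

Proof. Let $\omega \le \alpha^G$ be any partition with $H(\omega) < \infty$. Then

$$\begin{aligned}
H(\omega \mid \gamma_i) &= H(\omega \vee \gamma_i) - H(\gamma_i)\\
&\le |H(\omega \vee \gamma_i) - H(\omega \vee \beta_i)| + |H(\omega \vee \beta_i) - H(\beta_i)| + |H(\beta_i) - H(\gamma_i)|\\
&\le d(\omega \vee \gamma_i, \omega \vee \beta_i) + H(\omega \mid \beta_i) + d(\gamma_i, \beta_i)\\
&\le H(\omega \mid \beta_i) + 2d(\gamma_i, \beta_i).
\end{aligned}$$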

The result now follows from the hypothesis that $\lim_{i\to\infty} \beta_i = \alpha^G$. □

Proof of proposition 12.1. Since $\lim_{i\to\infty} \beta_i = \alpha^G$, there exist partitions $\gamma_i \subset \alpha^G$ such that
$d(\beta_i, \gamma_i) \to 0$. Since $\gamma_i \subset \alpha^G$, we can assume, without loss of generality, that $\gamma_i \le \alpha^{n(i)}$ for
some $n(i) \in \mathbb{N}$. For every $m > 0$, $H(\alpha^m \mid \beta_i) \to 0$ implies that $d(\alpha^m \vee \gamma_i, \beta_i) \to 0$ as $i \to \infty$. So there
is a sequence $m(i)$ such that $\lim_{i\to\infty} m(i) = +\infty$ and $d(\alpha^{m(i)} \vee \gamma_i, \beta_i) \to 0$. After replacing
$\gamma_i$ with $\alpha^{m(i)} \vee \gamma_i$ we may assume that $\alpha^{m(i)} \le \gamma_i \le \alpha^{n(i)}$.

Propositions 4.3 and 5.1 imply $F(\mu, \alpha^{n(i)}) \le F(\mu, \gamma_i)$. Thus

$$f(\mu, \alpha) = \inf_{n} F(\mu, \alpha^n) \le \liminf_{i\to\infty} F(\mu, \gamma_i) \le \limsup_{i\to\infty} F(\mu, \gamma_i).$$

We claim that equality holds in the above equation. Since $\gamma_i \le \alpha^G$ for all $i$, to prove the
claim we may assume that $\alpha$ is generating. By lemma 9.2 we may assume that $X = K^G$
and $T$ and $\alpha$ are the canonical action and partition respectively.

Let $\mu_i$ be the Markov approximation to $\mu$ induced by $\gamma_i$ (definition 31). Since $\alpha^{m(i)} \le \gamma_i$
and $m(i)$ tends to infinity with $i$, lemma 9.4 implies that $\mu_i$ tends to $\mu$ in the weak* topology.
Since $f$ is upper semi-continuous in the $\mu$ variable (lemma 9.5) and $F(\mu, \gamma_i) = F(\mu_i, \gamma_i) = f(\mu_i, \alpha)$
(by theorem 6.1), this implies

$$\limsup_{i\to\infty} F(\mu, \gamma_i) = \limsup_{i\to\infty} f(\mu_i, \alpha) \le f(\mu, \alpha).$$

This proves the claim. Since $\lim_{i\to\infty} d(\gamma_i, \beta_i) = 0$, lemma 12.3 implies $f(\mu, \alpha) = \lim_{i\to\infty} F(\mu, \beta_i)$.
This proves the proposition in the case of $F$. The proof with $F_*$ replacing $F$ is similar. □

13 Yuzvinskii's Addition Formula 

In this section, we prove theorem 1.5. The proof makes use of a generalization of a result
due to R. K. Thomas [Th71] which itself is a generalization of Yuzvinskii's formula. To state 
it properly, we need some definitions. 

Definition 33. Let $G = \langle s_1, \ldots, s_r \rangle$ and $G \curvearrowright^T (X, \mathcal{B}, \mu)$. Let $\Gamma$ be a separable compact
group with Haar probability measure $\nu$. Let $\{U_g\}_{g \in G}$ be an action of $G$ on $\Gamma$ by homomorphisms
that preserve Haar measure.

A cocycle for the actions $T$ and $U$ is a measurable map $\phi : G \times X \to \Gamma$ satisfying

$$\phi(g_2 g_1, x) = U_{g_2}(\phi(g_1, x))\,\phi(g_2, T_{g_1}x) \quad \forall g_1, g_2 \in G,\ x \in X.$$

The skew product action $\{S_g\}_{g \in G}$ of $G$ on $(X \times \Gamma, \mu \times \nu)$ is defined by

$$S_g(x, \gamma) = (T_g x, U_g(\gamma)\phi(g, x)) \quad \forall g \in G,\ x \in X,\ \gamma \in \Gamma.$$

We also write $S = T \times_\phi U$.
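
The cocycle equation is exactly what makes $S$ an action: $S_{g_2} S_{g_1} = S_{g_2 g_1}$. The sketch below checks this on a toy example and is only an illustration under assumptions that are not in the paper: $G$ is replaced by the semigroup $\mathbb{N}$ (rank one), $X = \mathbb{Z}/5$ with $T_g x = x + g$, $\Gamma = \mathbb{Z}/7$ written additively (so the product $U_g(\gamma)\phi(g, x)$ becomes a sum), $U_g(\gamma) = 3^g \gamma$, and $\phi(1, x)$ is an arbitrary function `c`. All names are hypothetical.

```python
from itertools import product

N_X, N_GAMMA, A = 5, 7, 3  # illustrative sizes; A is a unit mod 7

def T(g, x):
    """Rotation action of g >= 0 on X = Z/5."""
    return (x + g) % N_X

def U(g, gamma):
    """Automorphism gamma -> A^g * gamma of Gamma = Z/7; it preserves uniform (Haar) measure."""
    return (pow(A, g, N_GAMMA) * gamma) % N_GAMMA

def c(x):
    """Arbitrary choice of phi(1, x)."""
    return (x * x + 1) % N_GAMMA

def phi(g, x):
    """Cocycle generated by c via phi(g + 1, x) = U_1(phi(g, x)) + phi(1, T_g x)."""
    value = 0
    for k in range(g):
        value = (U(1, value) + c(T(k, x))) % N_GAMMA
    return value

def S(g, x, gamma):
    """Skew product S_g(x, gamma) = (T_g x, U_g(gamma) + phi(g, x)), additive notation."""
    return T(g, x), (U(g, gamma) + phi(g, x)) % N_GAMMA

def cocycle_identity_holds(max_g=6):
    return all(
        phi(g2 + g1, x) == (U(g2, phi(g1, x)) + phi(g2, T(g1, x))) % N_GAMMA
        for g1, g2, x in product(range(max_g), range(max_g), range(N_X))
    )

def is_action(max_g=6):
    return all(
        S(g2, *S(g1, x, gamma)) == S(g2 + g1, x, gamma)
        for g1, g2 in product(range(max_g), repeat=2)
        for x in range(N_X)
        for gamma in range(N_GAMMA)
    )

print(cocycle_identity_holds(), is_action())
```

Replacing the recursion in `phi` by an arbitrary function of $(g, x)$ would generally break both checks, which is one way to see why the cocycle condition is imposed.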

Theorem 13.1. Let $T, U, S, \phi$, etc. be as in the previous definition. Suppose $\Gamma$ is either
totally disconnected, a Lie group or a connected finite-dimensional abelian group. If there
are finite-entropy generating partitions $\alpha, \beta$ for $G \curvearrowright^T (X, \mathcal{B}, \mu)$ and $G \curvearrowright^U (\Gamma, \mathrm{Haar}(\Gamma))$
respectively then

$$f(S) = f(T \times_\phi U) = f(T) + f(U).$$

In [Th71], R. K. Thomas proved the above theorem in the case $G = \mathbb{Z}$ or $\mathbb{N}$ without the
finite-entropy restriction and without the restrictions on $\Gamma$. His proof relies on ideas from
[Yu65]. Next let us see how theorem 1.5 follows from theorem 13.1.

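Proof of theorem 1.5 assuming theorem 13.1. Let $\sigma : \mathcal{G}/\mathcal{N} \to \mathcal{G}$ be a Borel cross-section
(i.e., $\sigma(\gamma\mathcal{N}) \in \gamma\mathcal{N}$ for $\gamma \in \mathcal{G}$). Define a cocycle $\phi : G \times (\mathcal{G}/\mathcal{N}) \to \mathcal{N}$ by
$\phi(g, \gamma\mathcal{N}) = T_g(\sigma(\gamma\mathcal{N}))\,\sigma(T_g(\gamma)\mathcal{N})^{-1}$. Define $\psi : \mathcal{G}/\mathcal{N} \times \mathcal{N} \to \mathcal{G}$ by $\psi(\gamma\mathcal{N}, k) = k\sigma(\gamma\mathcal{N})$. An elementary
calculation shows that $\psi$ conjugates the skew-action $T_{\mathcal{G}/\mathcal{N}} \times_\phi T_{\mathcal{N}}$ with $T_{\mathcal{G}}$. Now apply
theorem 13.1. □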

Definition 34. A group $\Gamma$ is rigid if there exists an increasing sequence $\xi_1 \le \xi_2 \le \cdots$ of
finite partitions of $\Gamma$ and a real number $Q > 0$ such that $H(\alpha \mid \xi_i) \to 0$ for all finite-entropy
partitions $\alpha$ and $H(\xi_i\gamma \mid \xi_i) < Q$ for all $i$ and all $\gamma \in \Gamma$.

Theorem 13.2 ([Th71, theorem 2.3]). Suppose $G$ is isomorphic to either $\mathbb{Z}$ or $\mathbb{N}$. Let
$T, U, X, \Gamma, \phi$ be as in definition 33. Suppose $\Gamma$ is rigid. Let $\alpha$ and $\beta$ be partitions of $X$
and $\Gamma$ respectively. Let $\alpha \times \beta$ denote the product partition on $X \times \Gamma$. Then

$$h(T \times_\phi U, \alpha \times \beta) = h(T, \alpha) + h(U, \beta).$$

Proof. In theorem 2.3 of [Th71] this result is proven under the assumption that $\alpha$ and $\beta$ are
generating partitions. However, the proof yields this more general result with only minor
obvious modifications. □

Proposition 13.3. Theorem 13.1 is true whenever $\Gamma$ is rigid.

Proof. Let $\{\alpha_n\}$ be a sequence of finite partitions of $X$ such that $\alpha_n \to \alpha^G$. Similarly, let
$\{\beta_n\}$ be a sequence of finite partitions of $\Gamma$ such that $\beta_n \to \beta^G$. By proposition 12.1,

$$f(T \times_\phi U) = \lim_{n\to\infty} F_*(T \times_\phi U, \alpha_n \times \beta_n).$$

Theorem 13.2 and the definition of $F_*$ imply

$$F_*(T \times_\phi U, \alpha_n \times \beta_n) = F_*(T, \alpha_n) + F_*(U, \beta_n)$$

for any $n$. Now take the limit as $n \to \infty$ and apply proposition 12.1 again to obtain

$$f(T \times_\phi U) = f_*(T) + f_*(U) = f(T) + f(U).$$

□

Proposition 13.4. Totally disconnected groups, compact Lie groups, and finite-dimensional
compact connected abelian groups are rigid.

Proof. Rigidity for totally disconnected groups and finite-dimensional connected abelian 
groups is shown in theorems 7.2 and 7.3 of [Yz65]. There is a minor error in the abelian 
case, reproduced in [Th71, theorem 2.6]. It is corrected in [LSW90, lemma B.5]. Compact 
Lie groups were proven to be rigid in [Th71, theorem 2.5]. □ 

Theorem 13.1 now follows from the above and proposition 13.3. I conjecture that theorem
13.1 (and therefore theorem 1.5) holds for all compact separable groups $\Gamma$.